App. No. 09/882,076 
Amendment Dated: July 21, 2005 
Reply to Office Action of April 21, 2005 

REMARKS/ARGUMENTS 

In the Office Action mailed April 21, 2005, claims 1-23 were rejected under 35 U.S.C. 
§ 103(a) as being unpatentable over Cooper et al. (U.S. Patent No. 6,829,713) in view of Fruehling 
et al. (U.S. Patent No. 6,625,688). No claims have been added, canceled, or amended. Claims 
1-23 remain pending. 

I. Claim Rejections under 35 U.S.C. § 103(a) 

Claims 1-23 stand rejected under 35 U.S.C. § 103(a) as being unpatentable over Cooper 
et al. (U.S. Patent No. 6,829,713 B2) in view of Fruehling et al. (U.S. Patent No. 6,625,688 B1). 

Applicant respectfully traverses the Examiner's rejections under 35 U.S.C. § 103(a), on 
the grounds that Cooper et al. is not prior art under 35 U.S.C. § 102(e). The effective date of 
Cooper et al. is December 30, 2000. However, as detailed in the attached 37 CFR 1.131 
declaration (Exhibit A) and its attachments (Exhibits B and C), the present invention was 
reduced to practice prior to this date. Consequently, Cooper et al. is not prior art under 35 U.S.C. 
§ 102(e), and therefore cannot be combined with other art under 35 U.S.C. § 103(a) to reject the 
claims. For this reason, applicant respectfully requests reconsideration and withdrawal of the 
35 U.S.C. § 103(a) rejections. 

In view of the foregoing remarks, all pending claims are believed to be 
allowable and the application is in condition for allowance. Therefore, a Notice of Allowance is 
respectfully requested. Should the Examiner have any further issues regarding this application, 
the Examiner is requested to contact the undersigned attorney for the applicant at the telephone 
number provided below. 



Respectfully submitted, 



MERCHANT & GOULD P.C. 




Lawrence E. Lyckgf 
Registration No. 38,540 
Direct Dial: 206.342.6215 



MERCHANT & GOULD P.C. 
P. O. Box 2903 

Minneapolis, Minnesota 55402-0903 
206.342.6200 
S/N 09/882,076 PATENT 
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

Applicants: Stephane G. Plante et al. Examiner: Suresh Suryawanshi 

Application No.: 09/882,076 Group Art Unit: 2115 

Filed: June 15, 2001 Docket No.: 50037.08US01 

Title: METHOD AND SYSTEM FOR USING IDLE THREADS TO ADAPTIVELY 
THROTTLE A COMPUTER 

EXHIBIT A 

DECLARATION UNDER 37 CFR § 1.131 

We, Stephane G. Plante, John D. Vert, and Jacob Oshins, declare as follows: 

1. We are joint inventors named on U.S. Patent Application Serial No. 
09/882,076 filed June 15, 2001 (hereinafter, "this application"). 

2. I am aware that a Final Office Action was mailed in this application on 
April 21, 2005 and that, in this Final Office Action, all pending claims were rejected 
either as being obvious under 35 USC § 103(a) in view of U.S. Patent No. 6,829,713 B2 
Cooper et al. (hereinafter, "Cooper") and further in view of U.S. Patent No. 6,625,688 B1 
Fruehling et al. (hereinafter, "Frueling"). 

3. I am aware of an Amendment in the present application being filed in 
response to this subsequent Office Action and that this declaration is attached to that 
Amendment as Exhibit A. 

4. The invention set forth in all claims submitted in this application, whether 
pending, original or previously amended, was conceived and actually reduced to practice 
by us in this country at least prior to December 30, 2000. Exhibit B, attached hereto, is a 
document that lists versions dating from September 3, 2000 to September 25, 2000 and 
describes adaptive throttling. The document includes a general description of adaptive 
throttling as well as including computer-executable algorithms (e.g., see algorithm on 
pages 8-9) that illustrate the constructive reduction to practice of the invention. This 
document therefore contains a dated description that evidences conception and reduction 
to practice of the claimed invention at least as early as September 3, 2000. 
5. As further evidence of conception and reduction to practice, Exhibit C, 
attached hereto, illustrates a redlined updated version of the document in Exhibit B we 
made on October 29, 2000. As evidenced by the actual working computer-executable 
algorithms provided, this document further illustrates the conception and reduction to 
practice of the invention at least prior to December 30, 2000. 

6. We hereby declare that all statements made herein of our own knowledge 
are true and that all statements made on information and belief are believed to be true; 
and further that statements are made with the knowledge that willful false statements and 
the like so made are punishable by fine or imprisonment, or both, under Section 1001 of 
Title 18 of the United States Code and that such false statements may jeopardize the 
validity of the application or any patent issued thereon. 

Date 

Stephane G. Plante 

Date 

John D. Vert 

Date 

Jacob Oshins 




EXHIBIT B 



1.0 Document Overview 



1.1 Document Purpose 

The purpose of this paper is to describe a possible implementation for an Adaptive 
Throttling Policy. The intent of this document is to gather a permanent record of the 
design and thought process for patent and implementation verification purposes. 

1.2 Revision History 

• V0.1, September 3rd, 2000. Initial Revision 

• V0.2, September 6th, 2000. Per-Processor Performance State information 

• V0.3, September 11th, 2000. Review comments merged 

• V0.4, September 18th, 2000. Thermal Integration 

• V0.9, September 19th, 2000. Battery Integration. Document is Complete. 

• V0.91, September 25th, 2000. Typos & Corrections 

2.0 Design Vision 

2.1 What is Adaptive Throttling? 

When a computer is running on batteries, it is not desirable to always run the CPU at its 
maximum available frequency. For example, if the computer is idle, because the user is 
reading a Microsoft Word document, running the CPU at full frequency merely drains the 
battery much more quickly. 

Adaptive Throttling is the idea that the CPU should run at the maximum frequency 
required to fulfill the user's current needs. For example, while the user is reading the 
Microsoft Word document, the CPU should be throttled to its lowest possible frequency 
to save power. As soon as the user hits the page-down key or does anything else that 
requires the CPU, the CPU should be throttled back up to the frequency that places the 
CPU closest to being 100% busy as possible. 

In practice, the system should only pick the highest-throttle for each of the voltage states 
supported. The reason is that if the CPU is idle enough of the time, it will spend a 
large portion of time in the C2 state, which effectively means that it has been throttled to 
the correct level. Since Microsoft has invested a great deal of energy into getting the C-State 
algorithms correct, this implementation should leverage that. 

In the code, this is referred to as PO_THROTTLE_ADAPTIVE. 

2.2 What is Degraded Throttling? 

Degraded Throttling is a subset of Adaptive Throttling. The difference is that Adaptive 
Throttling does not put a cap on the maximum frequency that can be selected whereas 
Degraded Throttling does. This is useful in enforcing a policy where the user is willing to 
trade away some performance for longer battery life. It is particularly useful in situations 
where the CPU is stuck "busy-waiting". Typically the Operating System should begin by 
placing the cap at the lower-voltage, highest-throttle state and decreasing to lower throttle 
states as battery capacity diminishes. 

As an example, when the user hits the page-down key, the CPU might revert to 50% of 
its maximum frequency while the next part of the document is read from the disk and the 
screen is re-drawn. This might take a little longer than if the CPU had reverted to 100%, 
but it is assumed that there would be some saving in running the processor at a lower 
frequency for a longer period of time. 

In the code, this is referred to as PO_THROTTLE_DEGRADE. 

2.3 What is Constant Throttling? 

Constant Throttling is a subset of Adaptive Throttling and is very similar to Degraded 
Throttling. They both start at the lowest-voltage highest-frequency state, but unlike 
Degraded Throttling, Constant Throttling will never force the maximum throttle to go to a 
lower frequency state. The throttle is actually allowed to go to a lower frequency if so 
desired, but not forced to do so. 

In the code, this is referred to as PO_THROTTLE_CONSTANT. 
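
The difference between the three policies can be sketched as a cap on the fastest state the throttle may select. This is only an illustration under stated assumptions: states are taken to be indexed from fastest (index 0) to slowest, and the enum, the function name, and its parameters are hypothetical and do not appear in the code this document describes.

```c
#include <assert.h>

/* Hypothetical policy tags; the document's code uses PO_THROTTLE_* names. */
typedef enum {
    POLICY_ADAPTIVE,   /* no cap on the maximum frequency */
    POLICY_DEGRADED,   /* cap starts at the knee and is lowered over time */
    POLICY_CONSTANT    /* cap stays at the knee */
} ThrottlePolicySketch;

/* Returns the fastest state index the policy allows. Assumes index 0 is
 * the fastest state and larger indexes are slower. knee_index is the
 * lowest-voltage highest-frequency state; degraded_limit_index is where
 * the battery logic has currently placed the Degraded cap. */
unsigned fastest_allowed_index(ThrottlePolicySketch policy,
                               unsigned knee_index,
                               unsigned degraded_limit_index)
{
    switch (policy) {
    case POLICY_CONSTANT:
        return knee_index;
    case POLICY_DEGRADED:
        /* The cap is never faster than the knee. */
        return degraded_limit_index > knee_index ? degraded_limit_index
                                                 : knee_index;
    default:
        return 0;
    }
}
```

Under any of the policies the throttle may still settle on a slower state than the cap; the cap only bounds how fast it can go.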

2.4 Why Implement Adaptive Throttling? 

We currently handle a few scenarios badly. And OEM perception, with the help of AMD 
and Intel, is that there are other scenarios that we handle badly which we think we handle 
well, which leads to them shipping their own drivers and crapplets. 

• Machine is mostly or completely idle: We currently handle this well. We put the 
CPU into C3 via the SLP# signal. The CPU is as deeply asleep as it would be in 
S1. 

• Machine is used to play Ms. PacMan, or any other app that eats some but not all 
of the CPU bandwidth. We currently handle this poorly. We put the CPU into C1 
whenever we hit the idle loop, meaning that the CPU consumes quite a bit of 
power. Part of what makes this scenario tricky is that we don't know if the app 
will gracefully degrade if we take away CPU bandwidth. We should handle this 
scenario at first by trying to match CPU performance with CPU bandwidth. We 
can make the CPU bandwidth degrade over time if the Degrade policy is chosen. 

• Machine is running with apps consuming all CPU. We handle this poorly. The 
CPU stays in C0. Our attitude in the past has been to find these apps and get them 
to fix it. Unfortunately, this attitude hasn't paid off. 

• Machine is being used to play a DVD. We handle this poorly. The CPU stays in 
C0. The scenario is worthy of mention because it has special attributes. First, we 
know that restricting the CPU bandwidth will result in degraded performance, but 
the app will still run. Second, we could potentially find out how long the DVD is, 
allowing us to know how much total performance is needed. Third, on the laptops 
that we have looked at, DVD playback tends not to use all available CPU 
bandwidth. 



2.5 Hardware Issues 

Some concerns regarding the current and future generation of processors are important 
considerations. 

• The hardware latency for changing the voltage is potentially long enough that it 
can cause CPU-availability problems. For example, a soft-modem can drop a 
connection if it isn't serviced every 10ms, and it will never make a connection 
if it isn't serviced every 2ms during the "training phase" (at the beginning of the 
call). Intel's best-case voltage switching time is around 2ms. Dell machines seem 
to take 24 retries, which brings them into the > 50ms range. AMD is currently at 
200us, with an occasional retry. Transmeta claims to be at 20us. 

• The hardware latency for changing the CPU frequency without changing the 
voltage is around 3us. Unfortunately, we don't have a way of telling the kernel 
that the latency for entering a state is large if you're coming from a different 
voltage but tiny if you're staying with the same voltage. We could make this 
assumption directly in the kernel. 

• CPU frequency is adjusted by causing the chipset to deassert the CLK_RUN# 
signal N out of M cycles, where M is usually 8. Deasserting CLK_RUN# is also 
what happens when we hit the C2 idle state. 

• Using the C3 idle state consumes significantly less power than the C2 power state. 
Throttling while using C3 causes the CPU to run longer, effectively putting it in 
C2 for part of the time that it could be in C3. This means that throttling while 
using C3 wastes power. IBM and others have shown us empirical data to support 
this. 
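
The N-out-of-M clock throttling described in the list above implies a simple duty-cycle relationship. The sketch below illustrates it; the function name and parameters are hypothetical, and it assumes the CPU does useful work only in the cycles where the clock is not stopped.

```c
#include <assert.h>

/* Hypothetical helper: effective frequency, as a percentage of the
 * unthrottled clock, when the clock is stopped for `stopped` out of
 * every `window` cycles (N out of M in the text, with M usually 8).
 * The result is truncated to a whole percentage. */
unsigned effective_percent(unsigned stopped, unsigned window)
{
    assert(window > 0 && stopped <= window);
    return ((window - stopped) * 100u) / window;
}
```

With M = 8 this gives eight usable throttle steps between 12% and 100%, which is why only a small table of performance states is needed per voltage level.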

2.6 Integration Issues 

There are several minor issues that must be handled for the system to perform optimally. 

• The system should return to highest-frequency highest-voltage when beginning 
any sort of power management operation (Sleep or Hibernate). This is essential 
during Hibernate to ensure that writing the hibernate file does not become a CPU-bound 
operation. It also ensures that if the machine is transitioning to a Sleep or 
Hibernate state due to battery considerations, the machine spends the minimal 
amount of time making the transition. Just before entering the Sleep state, the 
processors should be returned to the lowest-voltage state possible. 

• The implementation should respect the result of the thermal policy manager. If the 
thermal policy forces the throttle to be reduced, the Adaptive Throttling manager 
should not increase the throttle past that point. 

• The Operating System will perform better if we return to the lowest-voltage 
highest-frequency state during any period of heavy C3 activity. 

2.7 Time Management 

For ease of integration with the existing Idle Promotion code base, all time units will be 
kept track of in terms of TickCounts. These are the units used in Prcb->KernelTime, 
Prcb->UserTime, and Thread->KernelTime. The following define will be used 
to establish the current system time: 

#define CUR_TIME(X) (X->KernelTime + X->UserTime) 



Where X represents a pointer to the current PRCB. The reason that we take in a pointer to 
the PRCB is that it is more efficient to get the PRCB once and then keep track of it in a 
local variable, even though KeGetCurrentPrcb() is expanded into an in-line call. 

3.0 Data Structures 

3.1 PROCESSOR_PERF_STATE 

This data structure is defined in ntos\pop.h. This structure replaces (in the kernel only) 
PROCESSOR_PERF_LEVEL. 



typedef struct {
    UCHAR     PercentFrequency;  // max = 100
    UCHAR     MinCapacity;       // Percentage
    USHORT    Power;             // milliwatts
    UCHAR     IncreaseLevel;     // goto higher freq
    UCHAR     DecreaseLevel;     // goto lower freq
    USHORT    Flags;             // Used for Flags
    ULONG     IncreaseTime;      // goto higher freq
    ULONG     DecreaseTime;      // goto lower freq
    ULONG     IncreaseCount;     // goto higher freq
    ULONG     DecreaseCount;     // goto lower freq
    ULONGLONG PerformanceTime;   // for tick count
} PROCESSOR_PERF_STATE, *PPROCESSOR_PERF_STATE;



The kernel will allocate an array of these structures and store the pointer to the array in 
the Processor's PRCB. 

The following Flags are defined as being available: 

#define POP_THROTTLE_NON_LINEAR 0x1 

PercentFrequency is the normalized representation of frequency that this 
performance state represents. The highest performance state has a frequency of 100%, if 
it is available. This avoids the problem of dealing with faster and faster CPUs. Under this 
mechanism, a CPU that has a max speed of 450MHz uses the same algorithm as one that 
runs at 700MHz. 

MinCapacity is used to represent the minimum battery remaining capacity that is 
required for the CPU to be in this state. It should be noted that this only applies if the 
machine is running on DC since the relevant throttling policies are only available then. 
This value is expressed as a percentage. 

IncreaseLevel and DecreaseLevel are the boundaries of the bucket that defines 
the current state. If the CPU is busier than IncreaseLevel, the Operating System 
should pick a higher processor frequency. If the CPU is less busy than 
DecreaseLevel, the Operating System should pick a lower processor frequency. 



It should be noted that PercentFrequency will be higher than IncreaseLevel 
since the only way to reach PercentFrequency is for the Operating System to be 
100% busy for the given frequency. That is, the CPU must be using every single cycle 
allocated to it in order to be running at the PercentFrequency level of busyness. In order 
to allow for promotion in cases where the system is not quite at that level of busyness, 
IncreaseLevel must be a smaller value than PercentFrequency. If 
IncreaseLevel is higher than PercentFrequency, then promotion can never 
occur. 
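
The bucket test described above can be sketched as a single decision step. The struct and function below are hypothetical stand-ins rather than the kernel definitions; the sketch assumes states are ordered from fastest (index 0) to slowest and that `busy` has already been normalized against the full-speed frequency.

```c
#include <assert.h>

/* Trimmed-down, hypothetical view of a performance state. */
typedef struct {
    unsigned char PercentFrequency; /* normalized frequency, max = 100 */
    unsigned char IncreaseLevel;    /* promote when busier than this */
    unsigned char DecreaseLevel;    /* demote when less busy than this */
} PerfStateSketch;

/* Returns the state index to use next. Index 0 is assumed to be the
 * fastest state; moving toward 0 is a promotion. */
unsigned next_state(const PerfStateSketch *states, unsigned count,
                    unsigned current, unsigned char busy)
{
    assert(count > 0 && current < count);
    if (busy > states[current].IncreaseLevel && current > 0) {
        return current - 1;   /* busier than the bucket: go faster */
    }
    if (busy < states[current].DecreaseLevel && current + 1 < count) {
        return current + 1;   /* idler than the bucket: go slower */
    }
    return current;           /* still inside the bucket */
}
```

Setting IncreaseLevel above PercentFrequency for a state, as in the first entry of a table such as {100, 101, 60}, makes promotion out of that state impossible, which matches the remark above.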

IncreaseCount and DecreaseCount are used to keep track of the number of 
transitions from this state to another performance state. Transitions to a higher 
performance state cause an increase in IncreaseCount. Transitions to a lower 
performance state cause an increase in DecreaseCount. 

PerformanceTime is used to keep track of the number of ticks that are spent at this 
performance level. This value is updated whenever the processor switches to a different 
performance state. When the user queries the performance information, the current 
elapsed time at this state is added to PerformanceTime for the current performance 
state. 
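
A query for the time spent in a state therefore has to combine the banked total with the time elapsed since the last switch. The following is a rough sketch of that accounting; the struct, function name, and parameters are hypothetical, and all times are assumed to be in the same tick units.

```c
#include <assert.h>

/* Hypothetical per-state record holding only the banked tick total. */
typedef struct {
    unsigned long long PerformanceTime; /* ticks banked at past switches */
} PerfTimeSketch;

/* Returns the ticks spent in states[index] as of `now`. For the state
 * that is currently selected, the ticks elapsed since the last switch
 * (now - perf_tick_count) have not been banked yet and are added here. */
unsigned long long query_state_ticks(const PerfTimeSketch *states,
                                     unsigned count, unsigned index,
                                     unsigned current_index,
                                     unsigned long perf_tick_count,
                                     unsigned long now)
{
    unsigned long long total;

    assert(index < count && current_index < count);
    total = states[index].PerformanceTime;
    if (index == current_index) {
        total += now - perf_tick_count;
    }
    return total;
}
```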

3.2 PROCESSOR_POWER_STATE 

This structure is defined in sdkinc\ntpoacpi.h. This structure already exists but must be 
grown to accommodate more information. Changed elements are listed in red. 
typedef struct {
    PPROCESSOR_IDLE_FUNCTION IdleFunction;
    ULONG                    Idle0KernelTimeLimit;
    ULONG                    Idle0LastTime;
    PVOID                    IdleState;
    ULONGLONG                LastCheck;
    PROCESSOR_IDLE_TIMES     IdleTimes;
    ULONG                    IdleTime1;
    ULONG                    PromotionCheck;
    ULONG                    IdleTime2;
    UCHAR                    CurrentThrottle;
    UCHAR                    ThermalThrottleLimit;
    UCHAR                    Spare1[2];
    UCHAR                    ThermalThrottleIndex;
    UCHAR                    CurrentThrottleIndex;
    ULONG                    Spare2[2];
    ULONG                    PerfSystemTime;
    ULONG                    PerfIdleTime;

    // Temp for debugging...
    ULONGLONG                DebugDelta;
    ULONG                    DebugCount;
    ULONG                    LastSysTime;
    ULONGLONG                TotalIdleStateTime[3];
    ULONG                    TotalIdleTransitions[3];
    ULONG                    Spare3;
    ULONGLONG                PreviousC3StateTime;
    UCHAR                    KneeThrottleIndex;
    UCHAR                    ThrottleLimitIndex;
    UCHAR                    PerfStatesCount;
    UCHAR                    Spare1[1];
    ULONG                    Flags;
    LARGE_INTEGER            PerfCounterFrequency;
    ULONG                    PerfTickCount;
    KTIMER                   PerfTimer;
    KDPC                     PerfDpc;
    PPROCESSOR_PERF_STATE    PerfStates;
    PSET_PROCESSOR_THROTTLE  PerfSetThrottle;
} PROCESSOR_POWER_STATE, *PPROCESSOR_POWER_STATE;

The PerfDpc and PerfTimer variables are convenient storage areas for the context 
information that will be required to fire a periodic timer to make sure that the CPU is not 
too busy for its current performance level. The Flags field will be used to store useful 
state information. The only useful defines (as of this document) are: 

#define PSTATE_ADAPTIVE_THROTTLE 0x1 

#define PSTATE_DEGRADED_THROTTLE 0x2 

#define PSTATE_CONSTANT_THROTTLE 0x4 

It should be noted that PSTATE_ADAPTIVE_THROTTLE is what turns on the entire 
throttling behavior and that the other flags are used to modify the behavior. 
PSTATE_DEGRADED_THROTTLE and PSTATE_CONSTANT_THROTTLE are also 
mutually exclusive flags. 
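
The flag rules above can be checked mechanically. The sketch below repeats the three defines so that it compiles on its own; the function names are hypothetical. Treating a modifier flag without PSTATE_ADAPTIVE_THROTTLE as invalid is an assumption drawn from the statement that the adaptive flag turns on the entire behavior, not something the document states outright.

```c
#include <assert.h>

#define PSTATE_ADAPTIVE_THROTTLE 0x1
#define PSTATE_DEGRADED_THROTTLE 0x2
#define PSTATE_CONSTANT_THROTTLE 0x4

/* Returns 1 if the flag combination is consistent, 0 otherwise. */
int throttle_flags_valid(unsigned long flags)
{
    int adaptive = (flags & PSTATE_ADAPTIVE_THROTTLE) != 0;
    int degraded = (flags & PSTATE_DEGRADED_THROTTLE) != 0;
    int constant = (flags & PSTATE_CONSTANT_THROTTLE) != 0;

    if (degraded && constant) {
        return 0;   /* the two modifiers are mutually exclusive */
    }
    if ((degraded || constant) && !adaptive) {
        return 0;   /* assumption: a modifier needs the adaptive flag */
    }
    return 1;
}

/* Hypothetical convenience wrapper: asserts validity in debug builds
 * and passes the flags through unchanged. */
unsigned long checked_throttle_flags(unsigned long flags)
{
    assert(throttle_flags_valid(flags));
    return flags;
}
```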

The PerfSystemTime and PerfIdleTime fields are used to keep track of 
previous values to calculate the important time deltas. PerfSystemTime is used to 
store the amount of system time previously elapsed, as expressed by 
CUR_TIME(Prcb). PerfIdleTime is used to store the amount of time that was 
spent in the idle thread as expressed by Thread->KernelTime. 

PerfStates is a pointer to the PROCESSOR_PERF_STATE array associated with this 
processor. Each processor must have a unique copy of this structure to maintain 
per-processor information about the amount of time spent in each state. 
PerfStatesCount is the number of elements within this array. 

CurrentThrottleIndex is the index in the PerfStates array that has currently 
been selected. It is found by using a for loop to locate the entry in the 
PerfStates array that corresponds to CurrentThrottle. 
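
The for loop mentioned above might look like the following sketch. To keep it self-contained it searches a plain array of PercentFrequency values rather than the full structure array; the function name and the -1 "not found" convention are hypothetical.

```c
#include <assert.h>

/* Returns the index of the entry whose frequency equals current_throttle,
 * or -1 if no entry matches. percent_frequency[i] stands in for
 * PerfStates[i].PercentFrequency. */
int find_throttle_index(const unsigned char *percent_frequency,
                        unsigned count, unsigned char current_throttle)
{
    unsigned i;

    assert(percent_frequency != 0 || count == 0);
    for (i = 0; i < count; i++) {
        if (percent_frequency[i] == current_throttle) {
            return (int) i;
        }
    }
    return -1;
}
```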

KneeThrottleIndex is the index in the PerfStates array that represents the 
lowest-voltage highest-frequency part of the curve. This value is pre-calculated since the 
Degraded and Constant Throttling policies depend upon it. 

ThrottleLimitIndex is the index in the PerfStates array that represents the 
maximum state that is acceptable under the current policy. The maximum of this value is 
the KneeThrottleIndex. This value is modified by the Kernel's Battery subsystem 
whenever it receives new battery capacity remaining information. 
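
One way the battery subsystem could derive ThrottleLimitIndex from the MinCapacity field is sketched below. This is an assumption about how the two pieces fit together, not the document's algorithm: it takes states to be ordered fastest first, starts at the knee, and steps to slower states while the remaining capacity is below the state's MinCapacity. The function name and the plain-array input are hypothetical.

```c
#include <assert.h>

/* Returns the fastest state index whose MinCapacity requirement is met,
 * never faster than knee_index. min_capacity[i] stands in for
 * PerfStates[i].MinCapacity; capacity_percent is remaining battery. */
unsigned throttle_limit_index(const unsigned char *min_capacity,
                              unsigned count, unsigned knee_index,
                              unsigned char capacity_percent)
{
    unsigned i = knee_index;

    assert(count > 0 && knee_index < count);
    while (i + 1 < count && capacity_percent < min_capacity[i]) {
        i++;    /* not enough battery for this state: try a slower one */
    }
    return i;
}
```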

ThermalThrottleIndex is the index in the PerfStates array that represents the 
maximum state that is acceptable based upon the thermal throttle. This value is used 
under all policies. 

PreviousC3StateTime is the amount of time spent at C3 as of the last 
successful throttle check. We need this time to compute the delta amount of time spent at 
C3 during the previous interval and thus determine the percentage time we spent at C3. 

PerfTickCount is the CUR_TIME(Prcb) when the processor switches to a new 
performance state. When the processor switches to a new performance state, the 
CUR_TIME(Prcb) minus PerfTickCount is added to the PerformanceTime 
bucket for the previous performance level. 

PerfCounterFrequency is the cached counter frequency obtained from 
KeQueryPerformanceCounter(). Caching this number minimizes the number of calls 
we make. PerfCounterFrequency is used to convert between values which are 
stored in PerformanceCounter units and those that are stored in 
Tick Counts. 
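
The conversion can be sketched by bringing both quantities into a common unit. The sketch below assumes that common unit is 100-nanosecond intervals, that the product US2SEC * US2TIME used later in this document equals 10,000,000, and that the tick increment is itself expressed in 100-nanosecond units. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of US2SEC * US2TIME: 100-ns intervals per second. */
#define HUNDRED_NS_PER_SECOND 10000000ull

/* Converts a performance-counter delta to 100-ns units, given the cached
 * counter frequency in counts per second. The multiplication can overflow
 * for very large deltas; this sketch ignores that case. */
uint64_t perf_counts_to_100ns(uint64_t counts, uint64_t counter_hz)
{
    assert(counter_hz > 0);
    return counts * HUNDRED_NS_PER_SECOND / counter_hz;
}

/* Converts a tick-count delta to 100-ns units, given the length of one
 * tick in 100-ns units. */
uint64_t ticks_to_100ns(uint64_t ticks, uint32_t increment_100ns)
{
    return ticks * increment_100ns;
}
```

Once both values are in the same unit, a ratio such as the C3 percentage can be computed directly.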



PerfSetThrottle is the function that will get called when the Operating System 
wants to set a new throttle. The reason that this is in this structure instead of a global 
variable is that we want to properly synchronize access to it. 

4.0 Algorithms 



4.1 Setting a Throttle Level 

The following code will be used to set a particular processor to a specific throttle level: 



VOID
FASTCALL
PopSetThrottle (
    PPROCESSOR_POWER_STATE PState,
    PPROCESSOR_PERF_STATE  PerfStates,
    ULONG                  Index,
    ULONG                  SystemTime,
    ULONG                  IdleTime
    )
{
    NTSTATUS Status;
    UCHAR    Current = PState->CurrentThrottleIndex;

    //
    // Actually set the processor to the new throttle level
    //
    Status = PState->PerfSetThrottle(
        PerfStates[Index].PercentFrequency
        );
    if (!NT_SUCCESS(Status)) {
        //
        // If it didn't succeed, then don't update the
        // stats.
        //
        return;
    }

    //
    // Update the bookkeeping for the current state
    //
    PerfStates[Current].PerformanceTime += (SystemTime -
        PState->PerfTickCount);

    //
    // Update the current throttle information
    //
    PState->CurrentThrottle = PerfStates[Index].PercentFrequency;
    PState->CurrentThrottleIndex = Index;

    //
    // Update our idea of what the current tick counts are
    //
    PState->PerfIdleTime = IdleTime;
    PState->PerfSystemTime = SystemTime;
    PState->PerfTickCount = SystemTime;

    //
    // Remember how much we spent in C3 at this point
    //
    PState->PreviousC3StateTime = PState->TotalIdleStateTime[2];
}

This function, which will be called only on the target processor, while running either at 
DPC level or within the affinity of the target processor, will actually set the new throttle 
and update the bookkeeping. 

4.2 Busy and C3 Detection 

The following code can be called within the context of the target processor to determine 
how busy the CPU has been during the previous time period. This function would 
typically be called from the IdleThread, a DPC, or while running at DISPATCH_LEVEL. 

UCHAR
CalculateBusyPercentage (
    PPROCESSOR_POWER_STATE PState
    )
{
    PKPRCB    Prcb;
    PKTHREAD  Thread;
    UCHAR     Frequency;
    ULONGLONG Idle;
    ULONG     Busy;
    ULONG     IdleTimeDelta;
    ULONG     CpuTimeDelta;

    Thread = KeGetCurrentThread();
    Prcb = CONTAINING_RECORD(PState, KPRCB, PowerState);

    IdleTimeDelta = Thread->KernelTime - PState->PerfIdleTime;
    CpuTimeDelta = CUR_TIME(Prcb) - PState->PerfSystemTime;
    Idle = (IdleTimeDelta * 100) / (CpuTimeDelta);

    // We cannot be more than 100% idle, and if we are then we
    // are 0% busy (by definition), so apply the proper caps
    if (Idle > 100) {
        return 0;
    }

    Busy = 100 - Idle;
    Frequency = (UCHAR) (Busy * PState->CurrentThrottle /
                         POWER_PERF_SCALE);
    return Frequency;
}


The Idle and Busy values represent a percentage of what the CPU was doing during the 
last interval. To simplify the math later on, these numbers are normalized against the 
current throttle value. 

For example: 

• If the CPU was 50% busy at 50% Throttle, that really means that the CPU was 
25% busy at 100% throttle. 

• If the CPU was 100% busy at 25% Throttle, that really means that the CPU was 
25% busy at 100% throttle. 

• If the CPU was 10% busy at 10% Throttle, that really means that the CPU was 
1% busy at 100% throttle. 
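
The three examples above can be reproduced with a small stand-alone version of the normalization. This is a sketch, not the kernel routine: the function name is hypothetical, tick deltas are passed in directly instead of being read from the PRCB, the scale is assumed to be 100, and the zero-interval guard is an addition that the listing above does not have.

```c
#include <assert.h>

/* Returns how busy the CPU was, normalized to 100% throttle.
 * idle_ticks and total_ticks are deltas over the last interval;
 * throttle is the current throttle as a percentage (0..100). */
unsigned char normalized_busy(unsigned idle_ticks, unsigned total_ticks,
                              unsigned char throttle)
{
    unsigned idle;

    assert(throttle <= 100);
    if (total_ticks == 0) {
        return 0;   /* assumption: no elapsed time means nothing to report */
    }
    idle = (idle_ticks * 100u) / total_ticks;
    if (idle > 100) {
        return 0;   /* more than 100% idle is treated as 0% busy */
    }
    return (unsigned char) ((100u - idle) * throttle / 100u);
}
```

For instance, 50 idle ticks out of 100 at a 50% throttle yields 25, matching the first bullet.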

Similarly, the formula for detecting how much time the CPU has spent in C3 during a 
known interval is: 

UCHAR
CalculateC3Percentage (
    PPROCESSOR_POWER_STATE PState
    )
{
    PKPRCB        Prcb;
    ULONGLONG     CpuTimeDelta;
    ULONGLONG     C3;
    LARGE_INTEGER C3Delta;

    Prcb = CONTAINING_RECORD(PState, KPRCB, PowerState);

    //
    // Calculate the C3 time delta in terms of Nanosecs.
    // The formulas for conversion are taken from
    // PopConvertUsToPerfCount
    //
    C3Delta.QuadPart = PState->TotalIdleStateTime[2] -
                       PState->PreviousC3StateTime;
    C3Delta.QuadPart = (US2SEC * US2TIME * C3Delta.QuadPart) /
                       PState->PerfCounterFrequency.QuadPart;

    //
    // Now, calculate the CpuTimeDelta in terms of
    // Nanoseconds
    //
    CpuTimeDelta = (CUR_TIME(Prcb) - PState->PerfSystemTime) *
                   KeTimeIncrement;

    //
    // Figure out the ratio of the two, and cap it
    // at 100%
    //
    C3 = C3Delta.QuadPart * 100 / CpuTimeDelta;
    if (C3 > 100) {
        return 100;
    }
    return (UCHAR) C3;
}

4.3 Calculating IncreaseLevel 

To calculate the upper bound for any PROCESSOR_PERF_STATE, the following 
rules apply: 

VOID
CalculateIncreaseLevel (
    PPROCESSOR_PERF_STATE CpuState,
    ULONG                 CpuStateCount
    )
{
    ULONG I;
    ULONG DeltaPerf;

    //
    // Optimization for case where there are no CpuStates
    //
    if (CpuStateCount == 0) {
        return;
    }

    //
    // This guarantees that we can never promote past this state
    //
    CpuState[0].IncreaseLevel = CpuState[0].PercentFrequency + 1;

    //
    // Calculate the increase level
    //
    for (I = 1; I < CpuStateCount; I++) {
        DeltaPerf = CpuState[I-1].PercentFrequency -
                    CpuState[I].PercentFrequency;
        DeltaPerf *= PopPerfIncreasePercentModifier;
        DeltaPerf /= POWER_PERF_SCALE;
        DeltaPerf += PopPerfIncreaseAbsoluteModifier;
        if (DeltaPerf > CpuState[I].PercentFrequency) {
            DeltaPerf = POWER_PERF_SCALE;
        } else {
            DeltaPerf = CpuState[I].PercentFrequency -
                        DeltaPerf;
        }
        CpuState[I].IncreaseLevel = (UCHAR) DeltaPerf;
    }
}

It should be noted that the increase level always represents the percentage busyness
required for a promotion to the next higher throttle level, regardless of whether or not a
voltage change is required. This is the case because it is impossible to increase by more
than one level using the Idle Detection algorithm previously presented.

If it is not desired that we promote to a higher frequency within the same voltage
band, then this could be accomplished by removing any non-linear states from the list of
states or by forcing the increase level to be the same value used by the highest frequency
state in the voltage range.
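
As a worked illustration of the increase-level rules, the following standalone sketch applies them to a small table. The type and function names are hypothetical, the two modifiers are passed as parameters rather than read from globals, and the states are assumed to be sorted from highest to lowest frequency.

```c
/* Hypothetical stand-in for the document's percent scale constant. */
#define PERF_SCALE 100u

/* Minimal, illustrative subset of a performance state entry. */
typedef struct {
    unsigned char percent_frequency;
    unsigned char increase_level;
} perf_state;

/*
 * Fill in increase_level for every state. State 0 can never be
 * promoted, so its level is set just above its own frequency.
 * Assumes states[] is ordered from highest to lowest frequency.
 */
static void calculate_increase_level(perf_state *states, unsigned count,
                                     unsigned percent_modifier,
                                     unsigned absolute_modifier)
{
    unsigned i;

    if (count == 0) {
        return;
    }
    states[0].increase_level =
        (unsigned char)(states[0].percent_frequency + 1);

    for (i = 1; i < count; i++) {
        unsigned delta = states[i - 1].percent_frequency -
                         states[i].percent_frequency;
        delta = delta * percent_modifier / PERF_SCALE + absolute_modifier;
        if (delta > states[i].percent_frequency) {
            delta = PERF_SCALE;   /* promotion from this state disabled */
        } else {
            delta = states[i].percent_frequency - delta;
        }
        states[i].increase_level = (unsigned char)delta;
    }
}
```

With states at 100%, 75%, and 50%, a percent modifier of 0, and an absolute modifier of 1, the resulting increase levels are 101, 74, and 49.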

4.4 Calculating Decrease Level 

To calculate the lower bound for any PROCESSOR_PERF_STATE, the following
rules apply:

VOID
CalculateDecreaseLevel (
    PPROCESSOR_PERF_STATE CpuState,
    ULONG                 CpuStateCount
    )
{
    //
    // We will be required to walk the CpuState array several times
    // and the only way to safely keep track of which index we are
    // looking at versus the one we care about is to use two variables
    // to keep track of indexes.
    //
    ULONG I, J;
    ULONG DeltaPerf;

    //
    // Calculate the decrease level
    //
    for (I = 0; I < CpuStateCount; I++) {
        if (I == (CpuStateCount - 1)) {
            CpuState[I].DecreaseLevel = 0;
            continue;
        }

        DeltaPerf = CpuState[I-1].PercentFrequency -
                    CpuState[I].PercentFrequency;
        DeltaPerf *= PopPerfDecreasePercentModifier;
        DeltaPerf /= POWER_PERF_SCALE;
        DeltaPerf += PopPerfDecreaseAbsoluteModifier;

        if (DeltaPerf > CpuState[I].PercentFrequency) {
            DeltaPerf = 0;
        } else {
            DeltaPerf = CpuState[I].PercentFrequency - DeltaPerf;
        }
        CpuState[I].DecreaseLevel = (UCHAR) DeltaPerf;
    }

    //
    // We want to eliminate demotions at the same voltage
    // level, so guarantee that the decrease levels result
    // in being set to the next voltage level...
    //
    I = 0;
    while (I < CpuStateCount) {
        //
        // Find the next non-linear state. We assume that I
        // is currently pointing at the highest-frequency
        // state within a voltage band and we are interested
        // in finding the highest-frequency state at the
        // next-lower voltage band.
        //
        for (J = I + 1; J < CpuStateCount; J++) {
            if (CpuState[J].Flags & POP_THROTTLE_NON_LINEAR) {
                break;
            }
        }

        //
        // Want to find the previous state since that
        // will be the decrease limit that we will use
        //
        J--;

        //
        // Set the decrease limit to this new level
        //
        while (I < J) {
            CpuState[I].DecreaseLevel =
                CpuState[J].DecreaseLevel;
            I++;
        }

        //
        // Skip the Jth state since it is the bottom of
        // the frequencies available for the current
        // voltage level. Note that we are skipping this
        // from I's point of view.
        //
        I++;
    }
}

This algorithm looks at the list of available states twice. The first time, it calculates the
value that would be used to decrease the throttle to the next lower value. The second
time, it calculates the value that would be used to decrease the throttle to the next lower
non-linear state. The reason that this algorithm is used is that there are almost no power
savings to decreasing the frequency but keeping the voltage constant. Thus, any
demotions should result in voltage changes.

The reason that decreasing the frequency but keeping the voltage produces few power
savings is that with an aggressive C2 policy, running the CPU at a higher frequency while
spending lots of time in C2 is equivalent to running the CPU at a lower frequency while
spending no time in C2.
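
The second pass can be illustrated in isolation. The sketch below takes decrease levels that have already been computed and collapses them so that every state in a voltage band shares the decrease level of the lowest-frequency state in that band. The names and the flag value are hypothetical; the flag is assumed to mark the first (highest-frequency) state of each lower voltage band.

```c
/* Hypothetical flag value marking the first state of a lower voltage band. */
#define THROTTLE_NON_LINEAR 0x1u

/* Minimal, illustrative subset of a performance state entry. */
typedef struct {
    unsigned char decrease_level;
    unsigned      flags;
} band_state;

/*
 * For each voltage band, copy the decrease level of the band's last
 * (lowest-frequency) state to every other state in the band, so that
 * a demotion always lands in the next-lower voltage band.
 */
static void collapse_decrease_levels(band_state *states, unsigned count)
{
    unsigned i = 0;

    while (i < count) {
        unsigned j;

        /* Find the start of the next band (or the end of the table). */
        for (j = i + 1; j < count; j++) {
            if (states[j].flags & THROTTLE_NON_LINEAR) {
                break;
            }
        }
        j--;                      /* last state of the current band */

        while (i < j) {
            states[i].decrease_level = states[j].decrease_level;
            i++;
        }
        i++;                      /* step past the band's last state */
    }
}
```

For a table whose first band holds three states with levels 87, 60, and 40 and whose second band holds two states with levels 30 and 0, the pass leaves 40, 40, 40, 0, 0.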

4.5 Calculating Increase/DecreaseTime

VOID
CalculateIncreaseDecreaseTime (
    PPROCESSOR_PERF_STATE     CpuState,
    ULONG                     CpuStateCount,
    PPROCESSOR_STATE_HANDLER2 PerfHandler
    )
{
    ULONG I;

    if (CpuStateCount == 0) {
        return;
    }

    //
    // We can never increase from State 0
    //
    CpuState[0].IncreaseTime = (ULONG) -1;

    //
    // Loop over the next elements...
    //
    for (I = 1; I < CpuStateCount; I++) {
        //
        // Decrease Time of previous state
        // should be based on whether current state is
        // linear or not
        //
        CpuState[I-1].DecreaseTime =
            PerfHandler->HardwareLatency * 10 *
            PopPerfDecreaseTimeValue;
        if (CpuState[I].Flags & POP_THROTTLE_NON_LINEAR) {
            CpuState[I-1].DecreaseTime *= 10;
        }

        //
        // Increase Time of current state should be
        // based on whether or not the state is
        // linear
        //
        CpuState[I].IncreaseTime =
            PerfHandler->HardwareLatency * 10 *
            PopPerfIncreaseTimeValue;
        if (CpuState[I-1].Flags & POP_THROTTLE_NON_LINEAR) {
            CpuState[I].IncreaseTime *= 10;
        }
    }

    //
    // We can never decrease from the last state
    //
    I--;
    CpuState[I].DecreaseTime = (ULONG) -1;
}

4.6 Calculate MinCapacity 

This routine is used to determine at what levels of battery capacity we start to force
the throttle down when we are running in the degraded throttle mode.



VOID
CalculateMinCapacity (
    PPROCESSOR_PERF_STATE  PerfStates,
    ULONG                  PerfStatesCount,
    PPROCESSOR_POWER_STATE PState
    )
{
    UCHAR I;
    UCHAR Num;
    UCHAR Total = PopPerfDegradeThrottleMinCapacity;
    UCHAR Width = 0;

    if (!PerfStatesCount) {
        return;
    }

    for (I = 0; I < PState->KneeThrottleIndex; I++) {
        //
        // Any of the steps before the knee are set to 100%
        //
        PerfStates[I].MinCapacity = 100;
    }

    //
    // Calculate the range for which we clamp down the throttle
    //
    Num = PerfStatesCount - (PState->KneeThrottleIndex + 1);
    if (Num != 0) {
        Width = Total / Num;
    }

    //
    // Look at all the states from the Knee to the end. Starting
    // at the highest value, set the min capacity and subtract the
    // appropriate value to get the next min capacity.
    //
    for (I = PState->KneeThrottleIndex; I < PerfStatesCount; I++) {
        //
        // We put a floor onto how low we can force the throttle
        // down to. If this state is operating below that floor,
        // then we should set the MinCapacity to 0, which
        // reflects the fact that we don't want to degrade
        // past this point
        //
        if (PerfStates[I].PercentFrequency <
            PopPerfDegradeThrottleMinFrequency) {
            //
            // We modify the min capacity for the
            // previous state since we don't ever
            // want to demote from that state.
            // Also, once we start being less than
            // the min frequency, the min capacity
            // will always be set to 0, except for
            // the last state. But this is okay since
            // we look at each state in order. We also
            // have to make sure that we don't violate array
            // bounds, but this can only happen if
            // the perf states array is badly formed
            // or the min frequency is badly formed.
            //
            if (I != 0) {
                PerfStates[I-1].MinCapacity = 0;
            }
        }

        PerfStates[I].MinCapacity = Total;
        Total -= Width;
    }
}

The logic behind this algorithm is that it is designed to allow the system to run at Lowest
Voltage-Highest Frequency (henceforth the KneeState) while the battery capacity
remaining is between 100% and PopPerfDegradeThrottleMinCapacity. Once
the battery capacity falls below that level, the highest allowed state is the next-highest
frequency available at the same voltage. When the battery capacity degrades past a
certain amount (which is based on PopPerfDegradeThrottleMinCapacity
and the number of available states), the highest allowed state becomes the next-highest
frequency remaining. Thus, as the battery capacity diminishes, the processor runs at a
lower and lower frequency.

Another feature of this algorithm is that it allows the OS to specify the smallest
frequency that the system can be forced to degrade to. For example, if a system supports
the following frequencies at the low voltage state: 50%, 40%, 30%, 20%, and 10%, then
setting PopPerfDegradeThrottleMinFrequency to 30% will guarantee that
we will never degrade the throttle below 30%.
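
The capacity table described above can be sketched as a standalone function. This is an illustrative reading of the routine rather than a definitive implementation: the names are hypothetical, the threshold and floor are passed in as parameters, and the knee index is assumed to be a valid index into the table.

```c
/* Minimal, illustrative subset of a performance state entry. */
typedef struct {
    unsigned char percent_frequency;
    unsigned char min_capacity;
} capacity_state;

/*
 * States above the knee require 100% capacity. From the knee down,
 * the required capacity starts at `threshold` and drops in equal
 * steps. If a state runs below `min_frequency`, the state before it
 * gets a min capacity of 0 so that it is never demoted.
 */
static void calculate_min_capacity(capacity_state *states, unsigned count,
                                   unsigned knee, unsigned threshold,
                                   unsigned min_frequency)
{
    unsigned i, num, width, total;

    if (count == 0 || knee >= count) {
        return;
    }
    for (i = 0; i < knee; i++) {
        states[i].min_capacity = 100;
    }

    num = count - (knee + 1);
    width = (num != 0) ? threshold / num : 0;
    total = threshold;

    for (i = knee; i < count; i++) {
        if (states[i].percent_frequency < min_frequency && i != 0) {
            states[i - 1].min_capacity = 0;
        }
        states[i].min_capacity = (unsigned char)total;
        total = (total >= width) ? total - width : 0;
    }
}
```

With five states at 100%, 75%, 50%, 40%, and 30%, a knee at index 2, and a threshold of 50, the table becomes 100, 100, 50, 25, 0. Raising the floor to 40% changes the fourth entry to 0, so the 40% state is never left.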

To minimize the amount of calculations we must make at run time, we can pre-calculate
the minimum battery capacity that must be remaining to be in the indicated performance
state. It should be noted that any algorithm used must be able to handle the case where
there are no more performance states after the knee state.

The beauty of this algorithm is that it can be easily replaced. If the Performance team
decides that a log, exp, or any other means of calculating the "curve" is more appropriate,
then only this function needs to be changed.



A sample equation could be:

$$\text{minCapacity} = \text{thresholdCapacity} \cdot \frac{\text{currentFrequency} - \text{minFrequency}}{\text{maxFrequency} - \text{minFrequency}}$$



4.7 System IdleLoop 

To prevent the system from considering the promotion and demotion of performance 
states as part of how busy the system is, it is clearly desirable to invoke the increase and 
decrease of the CPU throttles from within the idle loop. 

The basic idea of the code in the idle loop is that the code should check to see what
percentage (normalized to POP_PERF_SCALE) of time during the last interval was spent
running the idle thread and how much time was spent doing actual work. Once the
percentage of actual work done is calculated, the Operating System should look at
the PerfStates table to see if this value falls within the operating parameters for the
current bucket. It is important to make this check first because buckets are allowed to
overlap each other.

If the value does not fall within the current bucket, then the operating system finds the
closest performance state for which the value matches the parameters of the bucket.
There is an important distinction here since by picking the nearest bucket, the operating
system will have a tendency to not pick the highest or lowest performance states unless
there is absolutely no other choice. The advantage of this algorithm is that it will pick the
PercentFrequency closest to the calculated value.
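
The bucket search can be sketched on its own as follows. The names are hypothetical, the table is assumed to be ordered from highest to lowest frequency, and `current` is assumed to be a valid index. As in the idle-loop routine in this section, a promotion moves up by at most one state, while a demotion walks down until the busy value fits a bucket or the table ends.

```c
/* Minimal, illustrative subset of a performance state entry. */
typedef struct {
    unsigned char increase_level;
    unsigned char decrease_level;
} bucket_state;

/*
 * Return the index of the state that should be used for the given
 * normalized busy percentage. The current bucket is checked first
 * because buckets may overlap.
 */
static unsigned select_state(const bucket_state *states, unsigned count,
                             unsigned current, unsigned busy)
{
    unsigned i = current;

    if (count == 0) {
        return 0;
    }
    if (states[i].increase_level < busy) {
        if (i != 0) {
            i--;                          /* promote by one state */
        }
    } else if (states[i].decrease_level > busy) {
        while (i != count - 1) {          /* don't exceed the array */
            i++;
            if (states[i].decrease_level <= busy) {
                break;
            }
        }
    }
    return i;
}
```

For a three-state table with (increase, decrease) levels of (101, 60), (74, 40), and (49, 0), a busy value of 80 in state 1 promotes to state 0, a busy value of 50 stays in state 1, and a busy value of 10 in state 0 demotes to state 2.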

It is important to note that the Idle loop runs at DPC level and within the context of the 
processor for which it is targeted. That means that the code within the Idle loop cannot 
call anything that is marked as pageable. The benefit of running at DPC level is that no 
synchronization is required if the data structures used by the Idle loop can only be 
accessed by a thread running on the same processor at DPC level. This function should 
also be called before the C-State handler is invoked. 

The algorithm would look something like this: 

VOID
PoPerfIdle (
    PPROCESSOR_POWER_STATE PState
    )
{
    BOOLEAN               C3Forced = FALSE;
    BOOLEAN               Promoted = FALSE;
    BOOLEAN               Demoted = FALSE;
    PKPRCB                Prcb;
    PPROCESSOR_PERF_STATE PerfStates;
    ULONG                 TickCount;
    UCHAR                 I;
    UCHAR                 J;
    UCHAR                 CurrentPerfState;
    UCHAR                 Freq;
    ULONG                 IdleTime;
    ULONG                 Time;
    ULONG                 TimeDelta;
    ULONG                 PerfStatesCount;

    //
    // This piece of code should actually be done in the main
    // PopIdle0 or PopProcessorIdle routines to save a function
    // call. However, this code is included here for completeness.
    //
    Prcb = CONTAINING_RECORD(PState, KPRCB, PowerState);
    if (!(PState->Flags & PSTATE_ADAPTIVE_THROTTLE)) {
        return;
    }

    //
    // Has enough time expired?
    //
    Time = CUR_TIME(Prcb);
    IdleTime = Prcb->IdleThread->KernelTime;
    TimeDelta = Time - PState->PerfSystemTime;
    if (TimeDelta < PopPerfTimeDelta) {
        return;
    }

    //
    // Remember what the perf states are
    //
    PerfStates = PState->PerfStates;
    PerfStatesCount = PState->PerfStatesCount;

    //
    // Find the bucket with the correct frequency
    //
    CurrentPerfState = PState->CurrentThrottleIndex;
    I = CurrentPerfState;

    //
    // At this point, we need to see if the number of C3
    // transitions has exceeded a threshold value, and if
    // so, then we really need to throttle back to the
    // KneeThrottleIndex since we save more power if the
    // processor is at 100% and in C3 than if the processor
    // is at 12.5% and in C3.
    //
    Freq = CalculateC3Frequency(PState);
    if (Freq >= PopPerfMaxC3Frequency) {
        //
        // Set the throttle to the lowest knee in
        // the voltage & frequency curve
        //
        I = PState->KneeThrottleIndex;
        if (CurrentPerfState > I) {
            Promoted = TRUE;
        } else if (CurrentPerfState < I) {
            Demoted = TRUE;
        }

        //
        // Remember why we are doing this
        //
        C3Forced = TRUE;

        //
        // Skip to setting the throttle
        //
        goto PoPerfIdleSetThrottle;
    }

    //
    // Calculate how busy the CPU is
    //
    Freq = CalculateIdleFrequency(PState);

    //
    // Have we exceeded the thermal throttle limit?
    //
    if (Freq > PState->ThermalThrottleLimit) {
        //
        // The following code will force the frequency to
        // only as busy as the Thermal Throttle Limit will
        // actually allow. This removes the need for complicated
        // algorithms later on.
        //
        Freq = PState->ThermalThrottleLimit;
        I = PState->ThermalThrottleIndex;
    }

    //
    // Is there an upper limit to what the throttle can go to?
    // Note that because we check these after we have checked
    // the thermal limit, it means that it is not possible for
    // the frequency to exceed the thermal limit that we have specified
    //
    if (PState->Flags & PSTATE_DEGRADED_THROTTLE) {
        //
        // Make sure that we don't exceed the
        // state that is specified
        //
        J = PState->ThrottleLimitIndex;
        if (Freq >= PerfStates[J].PercentFrequency) {
            Freq = PerfStates[J].IncreaseLevel;
            I = J;
        }
    } else if (PState->Flags & PSTATE_CONSTANT_THROTTLE) {
        J = PState->KneeThrottleIndex;
        if (Freq >= PerfStates[J].PercentFrequency) {
            Freq = PerfStates[J].IncreaseLevel;
            I = J;
        }
    }

    //
    // Find the processor frequency that best matches
    // the one that we have just calculated. Please
    // note that this algorithm is written in such
    // a way that I can only travel in a single
    // direction. It is possible to collapse the
    // following code down, but not without allowing
    // the possibility of I doing a "yo-yo" between
    // two states (and thus never terminating the
    // while-loop).
    //
    if (PerfStates[I].IncreaseLevel < Freq) {
        if (I != 0) {
            Promoted = TRUE;
            I--;
        }
    } else if (PerfStates[I].DecreaseLevel > Freq) {
        while (1) {
            if (I == (PerfStatesCount - 1)) {
                // don't exceed the array
                break;
            }
            Demoted = TRUE;
            I++;
            if (PerfStates[I].DecreaseLevel <= Freq) {
                break;
            }
        }
    }

PoPerfIdleSetThrottle:

    //
    // Note we need to do this now because we don't want
    // to exit this code path without having set or cancelled
    // the timer as is appropriate. The only exception to
    // this rule is in the case where the system hit the
    // C3 limit.
    //
    // Cancel the timer under the following conditions
    //
    if (I == 0) {
        //
        // We are at 100% throttle, so timer won't
        // do much of anything
        //
        KeCancelTimer(&(PState->PerfTimer));
    } else if ((PState->Flags & PSTATE_CONSTANT_THROTTLE) &&
               I == PState->KneeThrottleIndex) {
        //
        // We are at the maximum throttle allowed
        //
        KeCancelTimer(&(PState->PerfTimer));
    } else if ((PState->Flags & PSTATE_DEGRADED_THROTTLE) &&
               I == PState->ThrottleLimitIndex) {
        //
        // We are at the maximum throttle allowed
        //
        KeCancelTimer(&(PState->PerfTimer));
    } else {
        //
        // No restrictions that we can think of,
        // so set the timer. Note that the semantics
        // of KeSetTimer are useful here: if the
        // timer has already been set, then this
        // resets it (moves it to the non-signaled
        // state) and recomputes the period.
        //
        KeSetTimer(
            &PState->PerfTimer,
            ...,
            &PState->PerfDpc
            );
    }

    //
    // We have to make special allowances if we were forced to
    // throttle because of C3 considerations
    //
    if (!C3Forced) {
        //
        // See if enough time has expired to justify changing
        // the throttle. This code is here because certain
        // transitions are fairly expensive (like those across a
        // voltage state) while others are cheap. So the amount of
        // time required before we will consider promotion/demotion
        // from the expensive state might be longer than the
        // interval at which we run this function.
        //
        if ((Promoted && TimeDelta < PerfStates[I].IncreaseTime) ||
            (Demoted && TimeDelta < PerfStates[I].DecreaseTime)) {
            //
            // We haven't had enough time in the current
            // state to justify the promotion or demotion.
            // We don't update the bookkeeping since we
            // haven't considered the current interval as
            // a "success". So, we just return.
            //
            // N.B. It is very important that we don't
            // update PState->PerfSystemTime here. If we
            // did, then it is possible that TimeDelta would
            // never exceed the thresholds required.
            //
            return;
        }
    }

    //
    // At this point, we need to update the bookkeeping
    //
    PState->PerfIdleTime = IdleTime;
    PState->PerfSystemTime = Time;
    PState->PreviousC3StateTime = PState->TotalIdleState[2];

    //
    // Update the promote and demote count
    //
    if (Promoted) {
        PerfStates[CurrentPerfState].IncreaseCount++;
    } else if (Demoted) {
        PerfStates[CurrentPerfState].DecreaseCount++;
    } else {
        //
        // At this point, we realize that we aren't
        // promoting or demoting and all the bookkeeping
        // is in order, so the appropriate thing to do
        // is just return.
        //
        return;
    }

    //
    // We have a new throttle. Update the bookkeeping to
    // reflect the amount of time that we spent in the
    // previous state and reset the count for the next
    // state
    //
    PopSetThrottle(
        PState,
        PerfStates,
        I,
        Time,
        IdleTime
        );
}



4.8 Processor Perf DPC 

A desirable feature to have in an adaptive throttling mechanism is the ability to sense that
the CPU has become 100% busy and that the throttle should be increased if required. The way
to accomplish this is to schedule a periodic timer that fires if the throttle is not set to
100%.



It is important to note that this DPC may fire in situations where the CPU is not 100% 
busy within a given time quantum. However, since the Idle Handler resets the timer count 
every time it runs, the number of spurious calls to this routine should be small. 

It is important to note that once the DPC has been fired, there is no need to cancel the 
timer since it is not scheduled as a periodic timer. 
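
The decision made by the timer routine can be sketched as two small standalone functions. This is a simplified illustration with hypothetical names: the throttle modes stand in for the PSTATE_* flags, and the starvation test follows the description of PopPerfCriticalIdleTimeDelta in section 5.1 (the throttle is raised when the idle thread has not accumulated the critical amount of run time).

```c
/* Hypothetical throttle modes standing in for the PSTATE_* flags. */
enum throttle_mode { MODE_ADAPTIVE, MODE_CONSTANT, MODE_DEGRADED };

/*
 * Returns nonzero when the throttle should be raised: enough time has
 * elapsed since the last check, yet the idle thread accumulated less
 * than the critical amount of run time in that period.
 */
static int idle_thread_starved(unsigned long elapsed,
                               unsigned long idle_elapsed,
                               unsigned long critical_time,
                               unsigned long critical_idle_time)
{
    if (elapsed < critical_time) {
        return 0;
    }
    return idle_elapsed < critical_idle_time;
}

/*
 * Pick the state index to jump to: the knee in constant mode, the
 * current limit in degraded mode, and state 0 (100%) otherwise.
 */
static unsigned raise_target(enum throttle_mode mode,
                             unsigned knee_index, unsigned limit_index)
{
    switch (mode) {
    case MODE_CONSTANT:
        return knee_index;
    case MODE_DEGRADED:
        return limit_index;
    default:
        return 0;
    }
}
```

Keeping the starvation test separate from the target selection makes it easy to change either policy without touching the other.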



The sample algorithm would look like this: 



VOID
PopPerfAdaptiveThrottleDpc (
    IN PKDPC Dpc,
    IN PVOID DpcContext,
    IN PVOID SystemArgument1,
    IN PVOID SystemArgument2
    )
{
    PKPRCB                 Prcb;
    PKTHREAD               IdleThread;
    PPROCESSOR_PERF_STATE  PerfStates;
    PPROCESSOR_POWER_STATE PState;
    UCHAR                  CurrentPerfState;
    UCHAR                  I;
    ULONG                  IdleTime;
    ULONG                  Time;
    ULONG                  TimeDelta;


    //
    // We need to fetch the PRCB and the PState structures.
    // We could easily call KeGetCurrentPrcb() here, but since we
    // had room for a single context, why bother making the
    // inline call (which generates more code than using the
    // context field anyways) when we can simply remember it.
    // The memory for the context field is already allocated
    // anyways.
    //
    Prcb = (PKPRCB) DpcContext;
    PState = &(Prcb->PowerState);

    //
    // Remember what the perf states are
    //
    PerfStates = PState->PerfStates;
    CurrentPerfState = PState->CurrentThrottleIndex;

    //
    // Let's see if enough kernel time has expired since
    // the last check...
    //
    Time = CUR_TIME(Prcb);
    TimeDelta = Time - PState->PerfSystemTime;
    if (TimeDelta < PopPerfCriticalTimeDelta) {
        return;
    }

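    //
    // How much time has expired on the Idle thread? If the
    // idle thread has run for at least the critical amount
    // of time, then it is not stalled and there is nothing to do.
    //
    IdleThread = Prcb->IdleThread;
    IdleTime = IdleThread->KernelTime;
    TimeDelta = IdleTime - PState->PerfIdleTime;
    if (TimeDelta >= PopPerfCriticalIdleTimeDelta) {
        return;
    }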

    //
    // At this point, we think that the idle thread is
    // stalled and that we need to do something fast to
    // get it moving... Like setting it to the perf state
    // that corresponds to the "knee" of the graph
    // or a perf state higher than the current one...
    //
    if (PState->Flags & PSTATE_CONSTANT_THROTTLE) {
        //
        // Pick the knee of the curve
        //
        I = PState->KneeThrottleIndex;
    } else if (PState->Flags & PSTATE_DEGRADED_THROTTLE) {
        //
        // Pick the maximum that we are allowed to go to
        //
        I = PState->ThrottleLimitIndex;
    } else {
        //
        // Go to 100%
        //
        I = 0;
    }

    //
    // Set the new throttle
    //
    PopSetThrottle(
        PState,
        PerfStates,
        I,
        Time,
        IdleTime
        );
}



5.0 Initialization Changes 

5.1 Global Variable Initialization 

The following global variables must be initialized at the same time that the kernel is 
loaded. To simplify changing these values later on, the Kernel will provide some default 
values that can be overridden by the registry. 

• PopPerfTimeDelta: A value in the same units as PRCB->KernelTime that 
corresponds to the Time Delta that must have occurred before the Idle thread will 
attempt to determine how busy the CPU was during the previous interval. 

• PopPerfCriticalTimeDelta: A value in the same units as PRCB->KernelTime that 
corresponds to the Time Delta that must have occurred before the Timer DPC will 
attempt to determine how busy the CPU was during the previous interval 

• PopPerfCriticalIdleTimeDelta: A value in the same units as PRCB->KernelTime
that corresponds to the Time Delta that must have occurred for the IdleThread
during a PopPerfCriticalTimeDelta period. If this time has not occurred, then the
throttle will be raised by the TimerDPC.

• PopPerfIncreasePercentModifier: A value between 0 and 100 that indicates what
percentage of the delta between the current state and the state to promote to
should be used to set the promote level. A lower value means that the overall
IncreaseLevel value will be higher (and thus promotions won't occur as
frequently). A suggested value would be 0%.

• PopPerfIncreaseAbsoluteModifier: A value between 0 and 100 that indicates how
many extra percentage points to remove from the promote level. A lower value
means that the overall IncreaseLevel value will be higher (and thus promotions
won't occur as frequently). It should be noted that if the value is particularly
high, then it might not be possible to promote from this state. A suggested value
would be 1%.

• PopPerfDecreasePercentModifier: A value between 0 and 100 that indicates what
percentage of the delta between the current state and the state to demote to
should be used to set the demote level. A higher value means that the overall
DecreaseLevel value will be lower (and thus demotions won't occur as
frequently). A suggested value would be 50%.

• PopPerfDecreaseAbsoluteModifier: A value between 0 and 100 that indicates how
many extra percentage points to subtract from the demote level. A higher value
means that the overall DecreaseLevel value will be lower (and thus demotions
won't occur as frequently). It should be noted that if the value is particularly
high, then it might not be possible to demote from this state. A suggested value
would be 1%.

• PopPerfIncreaseTimeValue: A value in the same units as PRCB->KernelTime
that corresponds to the Time Delta that must have occurred before a Throttle
Increase is considered. This value should be a multiple of PopPerfTimeDelta.
This value may also serve as a basis for calculating different time increments for
each Throttle Step.

• PopPerfDecreaseTimeValue: A value in the same units as PRCB->KernelTime
that corresponds to the Time Delta that must have occurred before a Throttle
Decrease is considered. This value should be a multiple of PopPerfTimeDelta.
This value may also serve as a basis for calculating different time increments for
each Throttle Step.

• PopProcPerfStateLookAsideList: This is a lookaside list that will be used to 
allocate PROCESSOR_PERF_STATE structures. 

• PopPerfDegradeThrottleMinCapacity: A value between 0 and 100 that represents
at what point of battery capacity we will start forcing down the throttle when we
are in the Degraded Throttling mode. For example, a value of 50% means that we
will start throttling when the battery capacity reaches 50%.

• PopPerfDegradeThrottleMinFrequency: A value between 0 and 100 that
represents the lowest frequency that we can force the throttle down to when we
are in the Degraded Throttling mode. For example, a value of 30% means that
we will never force the throttle below 30%.

5.2 PoInitializePrcb()

The PROCESSOR_POWER_STATE structure is initialized by PoInitializePrcb(). The 
following changes must occur: 

VOID
FASTCALL
PoInitializePrcb (
    PKPRCB Prcb
    )
{
    //
    // Zero power state structure
    //
    RtlZeroMemory(&Prcb->PowerState, sizeof(Prcb->PowerState));

    //
    // Initialize to legacy functions with promotion from it disabled
    //
    Prcb->PowerState.Idle0KernelTimeLimit = (ULONG) -1;
    Prcb->PowerState.IdleFunction = PopIdle0;
    Prcb->PowerState.CurrentThrottle = POP_PERF_SCALE;

    //
    // Initialize the Adaptive throttling subcomponents. The PRCB
    // is passed as the DPC context so that the DPC routine can
    // retrieve it without calling KeGetCurrentPrcb().
    //
    KeInitializeDpc(
        &(Prcb->PowerState.PerfDpc),
        PopPerfAdaptiveThrottleDpc,
        Prcb
        );
    KeSetTargetProcessorDpc(
        &(Prcb->PowerState.PerfDpc),
        Prcb->Number
        );
    KeInitializeTimer(
        &(Prcb->PowerState.PerfTimer)
        );
}

5.3 PopSetPerfLevels() 

The PROCESSOR_PERF_STATE structure is initialized by calls to PopSetPerfLevels(). 
The initialization can occur as before, when it was initializing 
PROCESSOR_PERF_LEVEL instead. 

NTSTATUS
PopSetPerfLevels (
    IN PPROCESSOR_STATE_HANDLER2 ProcessorHandler
    )
{
    KAFFINITY              Processors, CurrentAffinity;
    KIRQL                  OldIrql;
    PKPRCB                 Prcb;
    NTSTATUS               Status = STATUS_SUCCESS;
    ULONG                  i;
    ULONG                  PerfStatesCount = 0;
    UCHAR                  Freq;
    UCHAR                  KneeThrottleIndex = 0;
    UCHAR                  ThermalThrottleIndex = 0;
    PPROCESSOR_PERF_STATE  PerfStates = NULL;
    PPROCESSOR_PERF_STATE  TempStates;
    PPROCESSOR_POWER_STATE PState;

    //
    // The first step is to convert the data that was passed to us
    // as PROCESSOR_PERF_LEVEL over to PROCESSOR_PERF_STATE.
    //
    if (ProcessorHandler->NumPerfStates) {
        //
        // Because we are going to allocate the PerfStates
        // array first so that we can work on it, then copy
        // it to each processor, we can do that allocation from
        // paged pool.
        //
        PerfStatesCount = ProcessorHandler->NumPerfStates;
        PerfStates = ExAllocatePoolWithTag(
            PagedPool,
            PerfStatesCount * sizeof(PROCESSOR_PERF_STATE),
            'sPoP'
            );
        if (PerfStates == NULL) {
            //
            // We can handle this case. We will set the
            // status code to an appropriate failure code
            // and we will clean up the existing processor
            // states. The reason we do that is because
            // this function only gets called if the current
            // states are invalid, so keeping the current ones
            // would make no sense.
            //
            Status = STATUS_INSUFFICIENT_RESOURCES;
            goto PopSetPerfLevelsSetNewStates;
        }

        //
        // Initialize each of the PROCESSOR_PERF_STATE entries
        //
        for (i = 0; i < PerfStatesCount; i++) {
            PerfStates[i].PercentFrequency =
                ProcessorHandler->PerfLevel[i].PercentFrequency;
            PerfStates[i].Power =
                ProcessorHandler->PerfLevel[i].Power;
        }

        //
        // Analyze the PerfStates to determine which entries are
        // linear/non-linear
        //
        PopAnalyzePerfStates(PerfStates, PerfStatesCount);

        //
        // Calculate the increase level, decrease level, and
        // increase/decrease time
        //
        CalculateIncreaseLevel(PerfStates, PerfStatesCount);
        CalculateDecreaseLevel(PerfStates, PerfStatesCount);
        CalculateIncreaseDecreaseTime(
            PerfStates,
            PerfStatesCount,
            ProcessorHandler
            );

        //
        // Calculate where the Knee in the performance curve is
        //
        for (i = PerfStatesCount; i >= 1; i--) {
            if (PerfStates[i-1].Flags & POP_THROTTLE_NON_LINEAR) {
                KneeThrottleIndex = i-1;
            }
        }
    }


PopSetPerfLevelsSetNewStates:
    if (PerfStates) {
        //
        // We have perf states, so remember that in our
        // capabilities
        //
        PopCapabilities.ProcessorThrottle = TRUE;

        //
        // Find the minimum throttle value >=
        // PopIdleDefaultMinThrottle and the current maximum
        // throttle value
        //
        PopCapabilities.ProcessorMinThrottle = POP_PERF_SCALE;
        PopCapabilities.ProcessorMaxThrottle = 0;
        for (i = 0; i < ProcessorHandler->NumPerfStates; i++) {
            Freq = PerfStates[i].PercentFrequency;
            if ((Freq < PopCapabilities.ProcessorMinThrottle) &&
                (Freq >= PopIdleDefaultMinThrottle)) {
                PopCapabilities.ProcessorMinThrottle = Freq;
            }
            if ((Freq > PopCapabilities.ProcessorMaxThrottle) &&
                (Freq >= PopIdleDefaultMinThrottle)) {
                PopCapabilities.ProcessorMaxThrottle = Freq;
                ThermalThrottleIndex = i;
            }
        }

        //
        // There better be SOME speed we can run at.
        //
        ASSERT(PopCapabilities.ProcessorMaxThrottle >=
               PopIdleDefaultMinThrottle);
    } else {
        //
        // We don't have any perf states, so remember that in our
        // capabilities
        //
        PopCapabilities.ProcessorThrottle = FALSE;
        PopCapabilities.ProcessorMaxThrottle = POP_PERF_SCALE;
        PopCapabilities.ProcessorMinThrottle = POP_PERF_SCALE;
    }

    //
    // Initialize the PPROCESSOR_POWER_STATE for each processor
    //
    Processors = KeActiveProcessors;
    CurrentAffinity = 1;
    while (Processors) {
        if (!(Processors & CurrentAffinity)) {
            CurrentAffinity <<= 1;
            continue;
        }

        //
        // Remember that we did this processor and make sure that
        // we are actually running on that processor. This ensures
        // that we are synchronized with the DPC and idle loop
        // routines
        //
        Processors &= ~CurrentAffinity;
        KeSetSystemAffinityThread(CurrentAffinity);
        CurrentAffinity <<= 1;

        //
        // To make sure that we aren't pre-empted, we must raise
        // to DISPATCH_LEVEL
        //
        KeRaiseIrql(DISPATCH_LEVEL, &OldIrql);

        //
        // Get the PRCB and PPROCESSOR_POWER_STATE structures that
        // we will need to manipulate
        //
        Prcb = KeGetCurrentPrcb();
        PState = &Prcb->PowerState;

        //
        // Remember what our thermal limit is
        //
        PState->ThermalThrottleLimit =
            PopCapabilities.ProcessorMaxThrottle;
        PState->ThermalThrottleIndex = ThermalThrottleIndex;

        //
        // To get the bookkeeping to work out correctly, we will
        // set the throttle to 0% (which is not possible), set the
        // current index to the last state, and set current tick
        // count to the current time.
        //
        PState->CurrentThrottle = 0;
        PState->PerfTickCount = Time;
        if (PerfStatesCount) {
            PState->CurrentThrottleIndex = PerfStatesCount - 1;
        } else {
            PState->CurrentThrottleIndex = 0;
        }

        //
        // Reset the Knee Index. This indicates where the knee
        // in the performance curve is
        //
        PState->KneeThrottleIndex = KneeThrottleIndex;

        //
        // Reset the Throttle Limit Index
        //
        PState->ThrottleLimitIndex = KneeThrottleIndex;

        //
        // If there are already perf states present for this
        // processor, then free them
        //
        if (PState->PerfStates) {
            ExFreeToNPagedLookasideList(
                &PopProcPerfStateLookAsideList,
                PState->PerfStates
                );
            PState->PerfStates = NULL;
            PState->PerfStatesCount = 0;
        }

        //
        // At this point, we have to distinguish our behavior based
        // on whether or not we have perf states
        //
        if (PerfStates) {
            //
            // We do, so let's allocate some memory and make a copy
            // of the template that we already created.
            //
            TempStates = ExAllocateFromNPagedLookasideList(
                &PopProcPerfStateLookAsideList
                );
            if (TempStates == NULL) {
                //
                // Not being able to allocate this structure
                // is surely fatal. The only way to get around
                // it (I think) is to break out of this case
                // and treat it as if there are no PerfStates
                // available.
                //
                Status = STATUS_INSUFFICIENT_RESOURCES;
                KeBugCheckEx(
                    INTERNAL_POWER_FAILURE,
                    6,
                    STATUS_INSUFFICIENT_RESOURCES,
                    __LINE__,
                    0
                    );
            } else {
                //
                // Copy the template to the one associated with
                // the processor
                //
                RtlCopyMemory(
                    TempStates,
                    PerfStates,
                    PerfStatesCount *
                        sizeof(PROCESSOR_PERF_STATE)
                    );
                PState->PerfStates = TempStates;
                PState->PerfStatesCount = PerfStatesCount;
            }
        }

    //
    // Update the processor throttle function
    //
    if (PState->PerfStates) {
        PState->PopSetThrottle =
            ProcessorHandler->SetPerfLevel;
    } else {
        PState->PopSetThrottle = NULL;
    }

    //
    // Update the processor throttle (since we are already
    // running on the target processor)
    //
    PopUpdateProcessorThrottle();

    //
    // We can now return to our previous IRQL
    //
    KeLowerIrql(OldIrql);

    //
    // Return to the proper affinity
    //
    KeRevertToUserAffinityThread();

    //
    // Return whatever status code we have
    //
    return status;
}

This function has been extensively changed from the existing code base. The first and
most obvious change is that we allocate per-processor perf state arrays. This means
that we have to be able to handle the case where we cannot allocate such an array.
The solution to this problem is to treat that failure as an inability of the system
to throttle.

The changes to this function also mean that a great deal of additional intelligence will
have to be present in PopUpdateProcessorThrottle().

The purpose of this algorithm is to flush each processor from using the old processor
state tables. This is accomplished by running the thread in the context of each processor.
Since this structure is only accessed within the context of a TimerDPC and the
IdleThread, which both run at DPC level and thus cannot be pre-empted, this
guarantees that neither the IdleThread nor the TimerDPC can be running. If neither
is running, then neither is using the old copy of the structure, which means that it can
be safely freed and the new one used in its place.

The only potential problem with this algorithm is dealing with a low memory
situation. It is not acceptable to share copies of the PerfStates structure among
multiple processors. That leaves as solutions either guaranteeing that we don't run into
a low memory problem or issuing a bugcheck if we do. To avoid running into low memory
problems, the PerfStates structures can be allocated from an NPAGED_LOOKASIDE
list. A further optimization is that if the old PerfStates structure is the same size or
larger than the new one, it could be used instead.
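
The reuse optimization mentioned in the last paragraph can be sketched in portable C. This is a user-mode illustration only, not the kernel code: the names PerfState, PerfStateBuf and perf_buf_assign are hypothetical, and malloc stands in for the lookaside list. A failed allocation leaves the buffer empty, which corresponds to treating the processor as unable to throttle.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one perf state entry. */
typedef struct {
    unsigned char PercentFrequency;
} PerfState;

/* Hypothetical per-processor holder: the array plus its allocated size. */
typedef struct {
    PerfState *States;
    size_t     Count;     /* entries in use */
    size_t     Capacity;  /* entries allocated */
} PerfStateBuf;

/*
 * Copy a template into the per-processor buffer. The old allocation is
 * reused when it is at least as large as the template; otherwise a new
 * one is allocated. Returns 0 on success, or -1 if allocation fails, in
 * which case the buffer is left empty.
 */
int perf_buf_assign(PerfStateBuf *buf, const PerfState *tmpl, size_t count)
{
    assert(buf != NULL && (tmpl != NULL || count == 0));

    if (count > buf->Capacity) {
        PerfState *fresh = malloc(count * sizeof(*fresh));
        free(buf->States);
        buf->States = fresh;
        buf->Capacity = (fresh != NULL) ? count : 0;
        if (fresh == NULL) {
            buf->Count = 0;
            return -1;
        }
    }
    if (count > 0) {
        memcpy(buf->States, tmpl, count * sizeof(*tmpl));
    }
    buf->Count = count;
    return 0;
}
```

Reusing the old array avoids an allocation on the common path where the number of states does not grow, so the low memory case can only arise when a larger table is registered.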

5.4 PopUpdateAllThrottles()

This function is used to update all the throttles simultaneously. 

VOID
PopUpdateAllThrottles(
    VOID
    )
{
    KAFFINITY   Processors;
    KAFFINITY   CurrentAffinity;
    KIRQL       OldIrql;

    Processors = KeActiveProcessors;
    CurrentAffinity = 1;
    while (Processors) {
        if (Processors & CurrentAffinity) {
            //
            // Remove this processor from the set still to be
            // visited so that the loop terminates
            //
            Processors &= ~CurrentAffinity;
            KeSetSystemAffinityThread(CurrentAffinity);

            //
            // We must call PopUpdateProcessorThrottle
            // at DISPATCH_LEVEL
            //
            KeRaiseIrql(DISPATCH_LEVEL, &OldIrql);
            PopUpdateProcessorThrottle();
            KeLowerIrql(OldIrql);
        }
        CurrentAffinity <<= 1;
    }
    KeRevertToUserAffinityThread();
}

The principal change to this code is that we no longer have an early exit case if there are
no perf states registered. In theory, we could add a check that would look at
PopCapabilities.ProcessorThrottle.

The other change is that we will always call PopUpdateProcessorThrottle at
DISPATCH_LEVEL. This will guarantee that the DPC routine will not be able to pre-empt
the routine.

5.5 PopUpdateProcessorThrottle()

VOID
PopUpdateProcessorThrottle(
    VOID
    )
{
    PKPRCB                  Prcb;
    PPROCESSOR_PERF_STATE   PerfStates;
    PPROCESSOR_POWER_STATE  PState;
    UCHAR                   I;
    UCHAR                   Index;
    UCHAR                   NewLimit;
    UCHAR                   PerfStatesCount;
    ULONG                   IdleTime;
    ULONG                   Time;

    //
    // Get the PowerState structure from the PRCB
    //
    Prcb = KeGetCurrentPrcb();
    PState = &Prcb->PowerState;

    //
    // Make sure that this processor supports throttling
    //
    if (PState->PopSetThrottle == NULL) {
        return;
    }

    //
    // Get the current information such as current throttle, current
    // throttle index, current system time, current idle time
    //
    NewLimit = PState->CurrentThrottle;
    Index = PState->CurrentThrottleIndex;
    Time = CUR_TIME(Prcb);
    IdleTime = Prcb->IdleThread->KernelTime;

    //
    // We will need to refer to these frequently
    //
    PerfStates = PState->PerfStates;
    PerfStatesCount = PState->PerfStatesCount;

    //
    // If we are on AC, then we always want to run at the highest
    // possible speed. Also the same algorithm is used on DC if the
    // dynamic throttling policy is PO_THROTTLE_NONE.
    //
    if ((PopPolicy == &PopAcPolicy) ||
        (PopPolicy->DynamicThrottle == PO_THROTTLE_NONE)) {
        //
        // We precompute what the max throttle should be
        //
        Index = PState->ThermalThrottleIndex;
        NewLimit = PerfStates[Index].PercentFrequency;
    } else {
        //
        // We are on DC, apply the appropriate heuristics based on
        // the dynamic throttling policy.
        //
        switch (PopPolicy->DynamicThrottle) {
        case PO_THROTTLE_CONSTANT:
            //
            // We have pre-computed the optimal point on the
            // curve already. So, we might as well use that.
            //
            Index = PState->KneeThrottleIndex;
            NewLimit = PerfStates[Index].PercentFrequency;

            //
            // Set the constant flag and clear the degraded flag
            //
            PState->Flags &= ~PSTATE_DEGRADED_THROTTLE;
            PState->Flags |= PSTATE_CONSTANT_THROTTLE;
            PState->Flags |= PSTATE_ADAPTIVE_THROTTLE;
            break;

        case PO_THROTTLE_DEGRADE:
            //
            // We calculate the limit of the degrade throttle
            // on the fly.
            //
            Index = PState->ThrottleLimitIndex;
            NewLimit = PerfStates[Index].PercentFrequency;

            //
            // Set the degraded flag and clear the constant flag
            //
            PState->Flags &= ~PSTATE_CONSTANT_THROTTLE;
            PState->Flags |= PSTATE_DEGRADED_THROTTLE;
            PState->Flags |= PSTATE_ADAPTIVE_THROTTLE;
            break;

        case PO_THROTTLE_ADAPTIVE:
            PState->Flags |= PSTATE_ADAPTIVE_THROTTLE;
            break;

        default:
            // not implemented
            PoPrint(
                PO_THROTTLE,
                ("PopUpdateProcessorThrottle - unimplemented"
                 " dynamic throttle %d\n",
                 PopPolicy->DynamicThrottle)
                );
            break;
        }
    }

    //
    // Check if we are over the thermal limit.
    //
    ASSERT(PState->ThermalThrottleLimit >=
           PopCapabilities.ProcessorMinThrottle);
    if (NewLimit > PState->ThermalThrottleLimit) {
        PoPrint(
            PO_THERM,
            ("PopUpdateProcessorThrottle - new throttle limit %d"
             " over thermal limit %d\n",
             NewLimit,
             PState->ThermalThrottleLimit)
            );
        NewLimit = PState->ThermalThrottleLimit;
        Index = PState->ThermalThrottleIndex;
    }

    //
    // Apply new throttle if it has changed.
    //
    if (NewLimit != PState->CurrentThrottle) {
        PoPrint(
            PO_THROTTLE,
            ("PopUpdateProcessorThrottle - Setting CPU throttle "
             "to %d\n", NewLimit)
            );
        PopSetThrottle(
            PState,
            PerfStates,
            Index,
            Time,
            IdleTime
            );
    }
}

It should be noted that this function can only be called within the context of the target 
processor. This function does not acquire any spinlocks because it is running at 
DISPATCH_LEVEL, thus preventing the Timer DPC and the Idle Thread from running 
on this processor. 
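
The selection logic of this routine can be condensed into a small standalone function. The sketch below is an illustration under assumptions, not the kernel routine: ThrottlePolicy, PowerStateSketch and select_throttle_index are hypothetical names, the perf states are reduced to an array of frequency percentages, and the thermal limit is taken to be the frequency at ThermalThrottleIndex.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the dynamic throttle policies named above. */
typedef enum {
    THROTTLE_NONE,
    THROTTLE_CONSTANT,
    THROTTLE_DEGRADE,
    THROTTLE_ADAPTIVE
} ThrottlePolicy;

/* Hypothetical subset of the per-processor power state. */
typedef struct {
    unsigned char CurrentThrottleIndex;
    unsigned char KneeThrottleIndex;
    unsigned char ThrottleLimitIndex;
    unsigned char ThermalThrottleIndex;
} PowerStateSketch;

/*
 * Choose the perf state index for the next throttle. `percent[i]` is the
 * frequency of state i. On AC, or with no dynamic policy, the thermal
 * index is used directly; on DC the policy picks the starting point.
 * The result is then clamped so it never runs faster than the state at
 * ThermalThrottleIndex.
 */
unsigned char select_throttle_index(const PowerStateSketch *ps,
                                    const unsigned char *percent,
                                    int on_ac, ThrottlePolicy policy)
{
    unsigned char index;

    assert(ps != NULL && percent != NULL);

    if (on_ac || policy == THROTTLE_NONE) {
        index = ps->ThermalThrottleIndex;
    } else if (policy == THROTTLE_CONSTANT) {
        index = ps->KneeThrottleIndex;
    } else if (policy == THROTTLE_DEGRADE) {
        index = ps->ThrottleLimitIndex;
    } else {
        index = ps->CurrentThrottleIndex;   /* adaptive: keep current */
    }

    if (percent[index] > percent[ps->ThermalThrottleIndex]) {
        index = ps->ThermalThrottleIndex;
    }
    return index;
}
```

Applying the thermal clamp last, after the policy has chosen, keeps the thermal limit authoritative regardless of which policy is active.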

5.5 PopApplyThermalThrottle() 

VOID
PopApplyThermalThrottle(
    VOID
    )
{
    //
    // <Code which is not relevant to this spec here>
    //

#if DBG
    PoPrint(
        PO_THERM,
        ("Thermal - Zone %p - %s - Thermal throttle = %d.%d\n",
         thermalZone, t,
         (thermalThrottle / 10),
         (thermalThrottle % 10))
        );
    PoPrint(
        PO_THERM,
        ("Thermal - Zone %p - %s - Forced throttle = %d.%d\n",
         thermalZone, t,
         (forcedThrottle / 10),
         (forcedThrottle % 10))
        );
#endif

    //
    // Set limit on affected processors
    //
    currentAffinity = 1;
    processors = KeActiveProcessors;

    do {
        if (!(processors & currentAffinity)) {
            currentAffinity <<= 1;
            continue;
        }
        processors &= ~currentAffinity;

        //
        // We must run on the target processor
        //
        KeSetSystemAffinityThread(currentAffinity);
        pState = &(KeGetCurrentPrcb()->PowerState);

        //
        // We need to be running at DISPATCH_LEVEL to access the
        // structures referenced within the pState...
        //
        KeRaiseIrql(DISPATCH_LEVEL, &OldIrql);

        //
        // Convert throttles to processor bucket size. We need to
        // do this in the context of the target processor
        // to make sure that we get the correct set of perf levels
        //
        PopRoundThrottle(
            (UCHAR)(thermalThrottle / PO_TZ_THROTTLE_SCALE),
            &thermalLimit,
            NULL,
            &thermalLimitIndex,
            NULL
            );
        PopRoundThrottle(
            (UCHAR)(forcedThrottle / PO_TZ_THROTTLE_SCALE),
            &forcedLimit,
            NULL,
            &forcedLimitIndex,
            NULL
            );

#if DBG
        PoPrint(
            PO_THERM,
            ("Thermal - Zone %p - %d - Thermal Limit = %d\n",
             thermalZone, Prcb->Number, thermalLimit)
            );
        PoPrint(
            PO_THERM,
            ("Thermal - Zone %p - %d - Forced Limit = %d\n",
             thermalZone, Prcb->Number, forcedLimit)
            );
#endif

        //
        // Figure out which one we are going to use
        //
        limit = (thermalProcessors & currentAffinity) ?
            thermalLimit : forcedLimit;
        index = (thermalProcessors & currentAffinity) ?
            thermalLimitIndex : forcedLimitIndex;

        //
        // Next affinity mask
        //
        currentAffinity <<= 1;

        //
        // Does this processor support throttling?
        //
        if (pState->PopSetThrottle == NULL) {
            KeLowerIrql(OldIrql);
            continue;
        }

        //
        // Check processors limit for a change
        //
        if (limit > PopCapabilities.ProcessorMaxThrottle) {
            PoPrint(
                PO_THERM,
                ("PopThrottle: Limit (%d) > Scale (%d)\n",
                 limit,
                 PopCapabilities.ProcessorMaxThrottle)
                );
            limit = PopCapabilities.ProcessorMaxThrottle;
        } else if (limit < PopCapabilities.ProcessorMinThrottle) {
            PoPrint(
                PO_THERM,
                ("PopThrottle: Limit (%d) < MinThrottle "
                 "(%d)\n",
                 limit,
                 PopCapabilities.ProcessorMinThrottle)
                );
            limit = PopCapabilities.ProcessorMinThrottle;
        }
        if (pState->ThermalThrottleLimit != limit) {
            pState->ThermalThrottleLimit = limit;
            pState->ThermalThrottleIndex = index;
        }

        //
        // Revert back to our previous IRQL
        //
        KeLowerIrql(OldIrql);
    } while (processors);
    KeRevertToUserAffinityThread();

    //
    // Apply thermal throttles if necessary. Note we always do this
    // whether or not the limits were changed. This routine also gets
    // called whenever the system transitions from AC to DC, and that
    // may also require a throttle update due to dynamic throttling.
    //
    PopUpdateAllThrottles();
}



5.6 PopRoundThrottle() 

VOID
PopRoundThrottle(
    IN UCHAR Throttle,
    OUT OPTIONAL PUCHAR RoundDown,
    OUT OPTIONAL PUCHAR RoundUp,
    OUT OPTIONAL PUCHAR RoundDownIndex,
    OUT OPTIONAL PUCHAR RoundUpIndex
    )
{
    KIRQL                   OldIrql;
    PKPRCB                  Prcb;
    PPROCESSOR_PERF_STATE   PerfStates;
    PPROCESSOR_POWER_STATE  PState;
    UCHAR                   Low;
    UCHAR                   LowIndex;
    UCHAR                   High;
    UCHAR                   HighIndex;
    ULONG                   I;

    //
    // We need to get this processor's power capabilities
    //
    Prcb = KeGetCurrentPrcb();
    PState = &(Prcb->PowerState);

    //
    // Make sure that we are synchronized with the Idle thread
    // and other routines that access these data structures
    //
    KeRaiseIrql(DISPATCH_LEVEL, &OldIrql);
    PerfStates = PState->PerfStates;

    //
    // Does this processor support throttling?
    //
    if (PState->PopSetThrottle == NULL) {
        if (ARGUMENT_PRESENT(RoundUp)) {
            *RoundUp = Throttle;
            if (ARGUMENT_PRESENT(RoundUpIndex)) {
                *RoundUpIndex = 0;
            }
        }
        if (ARGUMENT_PRESENT(RoundDown)) {
            *RoundDown = Throttle;
            if (ARGUMENT_PRESENT(RoundDownIndex)) {
                *RoundDownIndex = 0;
            }
        }
        KeLowerIrql(OldIrql);
        return;
    }

    //
    // Check if the supplied throttle is out of range
    //
    if (Throttle <= PopCapabilities.ProcessorMinThrottle) {
        Throttle = PopCapabilities.ProcessorMinThrottle;
    } else if (Throttle >= PopCapabilities.ProcessorMaxThrottle) {
        Throttle = PopCapabilities.ProcessorMaxThrottle;
    }

    //
    // Initialize our search space to something reasonable
    //
    Low = High = PerfStates[0].PercentFrequency;
    LowIndex = HighIndex = 0;

    //
    // Look at all the available perf states
    //
    for (I = 0; I < PopPerfLevelCount; I++) {
        if ((Low > Throttle) &&
            (PerfStates[I].PercentFrequency < Low)) {
            Low = PerfStates[I].PercentFrequency;
            LowIndex = I;
        } else if ((PerfStates[I].PercentFrequency <= Throttle) &&
                   (PerfStates[I].PercentFrequency > Low)) {
            Low = PerfStates[I].PercentFrequency;
            LowIndex = I;
        }
        if ((High < Throttle) &&
            (PerfStates[I].PercentFrequency > High)) {
            High = PerfStates[I].PercentFrequency;
            HighIndex = I;
        } else if ((PerfStates[I].PercentFrequency >= Throttle) &&
                   (PerfStates[I].PercentFrequency < High)) {
            High = PerfStates[I].PercentFrequency;
            HighIndex = I;
        }
    }

    //
    // Revert back to our previous IRQL.
    //
    KeLowerIrql(OldIrql);

    //
    // Fill in the pointers provided by the caller
    //
    if (ARGUMENT_PRESENT(RoundUp)) {
        *RoundUp = High;
        if (ARGUMENT_PRESENT(RoundUpIndex)) {
            *RoundUpIndex = HighIndex;
        }
    }
    if (ARGUMENT_PRESENT(RoundDown)) {
        *RoundDown = Low;
        if (ARGUMENT_PRESENT(RoundDownIndex)) {
            *RoundDownIndex = LowIndex;
        }
    }
}

The changes in this routine include an optimization for dealing with the case where the
desired throttle is above/below the maximum/minimum. It also now returns the indices into
the PerfStates array that correspond to the rounded up and rounded down values.
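
The rounding behavior described here can be shown with a small standalone function. This is a sketch under assumptions rather than the kernel routine: round_throttle is a hypothetical name, the perf states are reduced to an array of frequency percentages in any order, and the requested throttle is clamped to the slowest and fastest entries before the search.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Find the indices of the closest states at or below (`down_index`) and
 * at or above (`up_index`) the requested throttle. Either output pointer
 * may be NULL if the caller does not need that side.
 */
void round_throttle(const unsigned char *percent, size_t count,
                    unsigned char throttle,
                    size_t *down_index, size_t *up_index)
{
    size_t i, min_i = 0, max_i = 0, lo, hi;

    assert(percent != NULL && count > 0);

    /* Locate the slowest and fastest states. */
    for (i = 1; i < count; i++) {
        if (percent[i] < percent[min_i]) min_i = i;
        if (percent[i] > percent[max_i]) max_i = i;
    }

    /* Clamp the request into the supported range. */
    if (throttle < percent[min_i]) throttle = percent[min_i];
    if (throttle > percent[max_i]) throttle = percent[max_i];

    /* Start from the extremes and tighten toward the request. */
    lo = min_i;
    hi = max_i;
    for (i = 0; i < count; i++) {
        if (percent[i] <= throttle && percent[i] > percent[lo]) lo = i;
        if (percent[i] >= throttle && percent[i] < percent[hi]) hi = i;
    }

    if (down_index != NULL) *down_index = lo;
    if (up_index != NULL) *up_index = hi;
}
```

With states of 100, 80, 60 and 40 percent, a request of 70 rounds down to the 60 percent state and up to the 80 percent state, while an exact match such as 80 yields the same index on both sides.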



5.7 PopCompositeBatteryDeviceHandler() 

This routine is the one that is notified when the total remaining battery level changes. For
adaptive throttling to work, whenever a new battery notification comes in, we need to
update the current ThrottleLimitIndex.

VOID
PopCompositeBatteryDeviceHandler(
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp,
    IN PVOID Context
    )
{
    <...>

    if (NT_SUCCESS(Irp->IoStatus.Status)) {
        //
        // Handle the completed request
        //
        switch (PopCB.State) {
        <...>

        case PO_CB_READ_STATUS:
            <...>

            if (Policy == &PopDCPolicy) {
                <...>

                //
                // This is kind of silly, but since we
                // want to minimize our synchronization
                // elsewhere, we have to examine every
                // processor's PowerState and update the
                // ThrottleLimitIndex on each. This
                // may eventually be the smart thing to
                // do if not all processors support the
                // same set of states.
                //
                currentAffinity = 1;
                processors = KeActiveProcessors;
                while (processors) {
                    if (!(processors & currentAffinity)) {
                        currentAffinity <<= 1;
                        continue;
                    }
                    processors &= ~currentAffinity;
                    KeSetSystemAffinityThread(
                        currentAffinity
                        );
                    currentAffinity <<= 1;

                    //
                    // We need to run at DISPATCH_LEVEL to
                    // properly synchronize access to these
                    // power structures
                    //
                    KeRaiseIrql(DISPATCH_LEVEL, &OldIrql);
                    Prcb = KeGetCurrentPrcb();
                    PState = &(Prcb->PowerState);
                    PerfStates = PState->PerfStates;
                    PerfStatesCount =
                        PState->PerfStatesCount;
                    for (I = PState->KneeThrottleIndex;
                         I < PerfStatesCount;
                         I++) {
                        if (PerfStates[I].MinCapacity >=
                            PopCB.Status.Capacity) {
                            break;
                        }
                    }
                    PState->ThrottleLimitIndex = I;

                    //
                    // We can revert back to our previous
                    // IRQL now
                    //
                    KeLowerIrql(OldIrql);
                }
                KeRevertToUserAffinityThread();
            }

            <...>
        }

        <...>
    }
}



EXHIBIT C 



1.0 Document Overview 

1.1 Document Purpose 

The purpose of this paper is to describe a possible implementation for an Adaptive
Throttling Policy. The intent of this document is to gather a permanent record of the
design and thought process for patent and implementation verification purposes.

1.2 Revision History 

• V0.1, September 3rd, 2000. Initial Revision

• V0.2, September 6, 2000. Per-Processor Performance State information

• V0.3, September 11th, 2000. Review comments merged

• V0.4, September 18th, 2000. Thermal Integration

• V0.9, September 19th, 2000. Battery Integration. Document is Complete.

• V0.91, September 25th, 2000. Typos & Corrections

• V1.00, October 29th, 2000. First implementation changes

2.0 Design Vision 

2.1 What is Adaptive Throttling? 

When a computer is running on batteries, it is not desirable to always run the CPU at its 
maximum available frequency. For example, if the computer is idle, because the user is 
reading a Microsoft Word document, running the CPU at full frequency merely drains the 
battery much more quickly. 

Adaptive Throttling is the idea that the CPU should run at the maximum frequency
required to fulfill the user's current needs. For example, while the user is reading the
Microsoft Word document, the CPU should be throttled to its lowest possible frequency
to save power. As soon as the user hits the page-down key or does anything else that
requires the CPU, the CPU should be throttled back up to the frequency that places the
CPU as close to 100% busy as possible.

In practice, the system should only pick the highest-throttle for each of the voltage states
supported. The reason is that if the CPU is idle enough of the time, it will spend a
large portion of time in the C2 state, which effectively means that it has been throttled to
the correct level. Since Microsoft has invested a great deal of energy into getting the
C-State algorithms correct, this implementation should leverage that.

In the code, this is referred to as PO_THROTTLE_ADAPTIVE.
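
One way to read "as close to 100% busy as possible" is to pick the slowest state that still covers the work currently being done. The sketch below illustrates that reading only; adaptive_target is a hypothetical name, and the conversion of measured busy time into a share of full-speed capacity is an assumption made for the example, not a formula taken from this document.

```c
#include <assert.h>
#include <stddef.h>

/*
 * `percent[i]` is the frequency of state i as a percentage of full
 * speed, `current` is the state the CPU is running in, and
 * `busy_percent` is how busy the CPU was at that state (0..100).
 * Returns the index of the slowest state whose frequency still covers
 * the measured demand, or the fastest state if none does.
 */
size_t adaptive_target(const unsigned char *percent, size_t count,
                       size_t current, unsigned busy_percent)
{
    unsigned demand;
    size_t i, best = 0;

    assert(percent != NULL && count > 0);
    assert(current < count && busy_percent <= 100);

    /* Work done, as a share of full-speed capacity (rounded up). */
    demand = (busy_percent * percent[current] + 99) / 100;

    /* Fall back to the fastest state. */
    for (i = 1; i < count; i++) {
        if (percent[i] > percent[best]) best = i;
    }

    /* Tighten to the slowest state that still covers the demand. */
    for (i = 0; i < count; i++) {
        if (percent[i] >= demand && percent[i] < percent[best]) best = i;
    }
    return best;
}
```

A CPU that measures 100% busy at its current state may need more than it is getting, so a measurement taken at saturation understates demand; a real policy has to promote before that point is reached.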

2.2 What is Degraded Throttling? 

Degraded Throttling is a subset of Adaptive Throttling. The difference is that Adaptive
Throttling does not put a cap on the maximum frequency that can be selected whereas
Degraded Throttling does. This is useful in enforcing a policy where the user is willing to
trade away some performance for longer battery life. It is particularly useful in situations
where the CPU is stuck "busy-waiting". Typically the Operating System should begin by
placing the cap at the lower-voltage, highest-throttle state and decreasing to lower throttle
states as battery capacity diminishes.

As an example, when the user hits the page-down key, the CPU might revert to 50% of
its maximum frequency while the next part of the document is read from the disk and the
screen is re-drawn. This might take a little longer than if the CPU had reverted to 100%,
but it is assumed that there would be some saving in running the processor at a lower
frequency for a longer period of time.

In the code, this is referred to as PO_THROTTLE_DEGRADE.

2.3 What is Constant Throttling? 

Constant Throttling is a subset of Adaptive Throttling and is very similar to Degraded
Throttling. They both start at the lowest-voltage highest-frequency state, but unlike
Degraded Throttling, Constant Throttling will never force the maximum throttle to go to a
lower frequency state. The throttle is actually allowed to go to a lower frequency if so
desired, but is not forced to do so.

In the code, this is referred to as a PO_THROTTLE_CONSTANT. 

2.4 Why Implement Adaptive Throttling? 

We currently handle a few scenarios badly. OEM perception, with the help of AMD
and Intel, is that there are other scenarios that we handle badly which we think we handle
well, which leads to them shipping their own drivers and crapplets.

• Machine is mostly or completely idle: We currently handle this well. We put the
CPU into C3 via the SLP# signal. The CPU is as deeply asleep as it would be in
S1.

• Machine is used to play Ms. PacMan, or any other app that eats some but not all
of the CPU bandwidth. We currently handle this poorly. We put the CPU into C1
whenever we hit the idle loop, meaning that the CPU consumes quite a bit of
power. Part of what makes this scenario tricky is that we don't know if the app
will gracefully degrade if we take away CPU bandwidth. We should handle this
scenario at first by trying to match CPU performance with CPU bandwidth. We
can make the CPU bandwidth degrade over time if the Degrade policy is chosen.

• Machine is running with apps consuming all CPU. We handle this poorly. The
CPU stays in C0. Our attitude in the past has been to find these apps and get them
to fix it. Unfortunately, this attitude hasn't paid off.

• Machine is being used to play a DVD. We handle this poorly. The CPU stays in
C0. The scenario is worthy of mention because it has special attributes. First, we
know that restricting the CPU bandwidth will result in degraded performance, but
the app will still run. Second, we could potentially find out how long the DVD is,
allowing us to know how much total performance is needed. Third, on the laptops
that we have looked at, DVD playback tends not to use all available CPU
bandwidth.



2.5 Hardware Issues 

Some concerns regarding the current and future generation of processors are important 
considerations. 

• The hardware latency for changing the voltage is potentially long enough that it
can cause CPU-availability problems. For example, a soft-modem can drop a
connection if it isn't serviced every 10ms, and it will never make a connection
if it isn't serviced every 2ms during the "training phase" (at the beginning of the
call). Intel's best-case voltage switching time is around 2ms. Dell machines seem
to take 24 retries, which brings them into the > 50ms range. AMD is currently at
200us, with an occasional retry. Transmeta claims to be at 20us.

• The hardware latency for changing the CPU frequency without changing the
voltage is around 3us. Unfortunately, we don't have a way of telling the kernel
that the latency for entering a state is large if you're coming from a different
voltage but tiny if you're staying with the same voltage. We could make this
assumption directly in the kernel.

• CPU frequency is adjusted by causing the chipset to deassert the CLK_RUN#
signal N out of M cycles, where M is usually 8. Deasserting CLK_RUN# is also
what happens when we hit the C2 idle state.

• Using the C3 idle state consumes significantly less power than the C2 idle state.
Throttling while using C3 causes the CPU to run longer, effectively putting it in
C2 for part of the time that it could be in C3. This means that throttling while
using C3 wastes power. IBM and others have shown us empirical data to support
this.

2.6 Integration Issues 

There are several minor issues that must be handled for the system to perform optimally. 

• The system should return to the highest-frequency highest-voltage state when
beginning any sort of power management operation (Sleep or Hibernate). This is
essential during Hibernate to ensure that writing the hibernate file does not become a
CPU-bound operation. It also ensures that if the machine is transitioning to a Sleep or
Hibernate state due to battery considerations, the machine spends the minimal
amount of time to make the transition. Just before entering the Sleep state, the
processors should be returned to the lowest-voltage state possible.

• The implementation should respect the result of the thermal policy manager. If the
thermal policy forces the throttle to be reduced, the Adaptive Throttling manager
should not increase the throttle past that point.

• The Operating System will perform better if we return to the lowest-voltage
highest-frequency state during any period of heavy C3 activity.

2.7 Time Management 

For ease of integration with the existing Idle Promotion code base, all time units will be
kept track of in terms of TickCounts. These are the units used in Prcb->KernelTime,
Prcb->UserTime, and Thread->KernelTime. The following define will be used
to establish the current system time:

#define POP_CUR_TIME(X) ((X)->KernelTime + (X)->UserTime)

Where X represents a pointer to the current PRCB. The reason that we take in a pointer to
the PRCB is that it is more efficient to get the PRCB once and then keep track of it in a
local variable, even though KeGetCurrentPrcb() is expanded into an in-line call.
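
Tick counts of this kind are typically turned into a busy percentage by differencing two samples. The following is a minimal user-mode sketch of that calculation: PrcbSketch and busy_percent are hypothetical names, and the sketch assumes that the idle thread's ticks are included in the system time, so busy time is the elapsed system time minus the elapsed idle time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two PRCB counters used by the macro. */
typedef struct {
    uint32_t KernelTime;
    uint32_t UserTime;
} PrcbSketch;

#define POP_CUR_TIME(X) ((X)->KernelTime + (X)->UserTime)

/*
 * Percentage of the interval between two samples that was spent doing
 * work. Unsigned subtraction keeps the deltas correct even if the tick
 * counters wrap around between samples.
 */
unsigned busy_percent(uint32_t prev_system, uint32_t prev_idle,
                      uint32_t cur_system, uint32_t cur_idle)
{
    uint32_t elapsed = cur_system - prev_system;
    uint32_t idle = cur_idle - prev_idle;
    unsigned busy;

    if (elapsed == 0) {
        return 0;               /* no time has passed, nothing to report */
    }
    if (idle > elapsed) {
        idle = elapsed;         /* guard against skewed samples */
    }
    busy = (unsigned)(((uint64_t)(elapsed - idle) * 100) / elapsed);
    assert(busy <= 100);
    return busy;
}
```

For example, 1000 elapsed ticks of which 250 were idle gives a busy percentage of 75.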

3.0 Data Structures 

3.1 PROCESSOR_PERF_STATE 

This data structure is defined in ntos\pop.h. This structure replaces (in the kernel only)
PROCESSOR_PERF_LEVEL.

typedef struct {
    UCHAR       PercentFrequency;   // max = 100
    UCHAR       MinCapacity;        // Percentage
    USHORT      Power;              // milliwatts
    UCHAR       IncreaseLevel;      // goto higher freq
    UCHAR       DecreaseLevel;      // goto lower freq
    USHORT      Flags;              // Used for Flags
    ULONG       IncreaseTime;       // goto higher freq
    ULONG       DecreaseTime;       // goto lower freq
    ULONG       IncreaseCount;      // goto higher freq
    ULONG       DecreaseCount;      // goto lower freq
    ULONGLONG   PerformanceTime;    // for tick count
} PROCESSOR_PERF_STATE, *PPROCESSOR_PERF_STATE;

The kernel will allocate an array of these structures and store the pointer to the array in 
the Processor's PRCB. 

The following Flags are defined as being available: 

#define POP_THROTTLE_NON_LINEAR 0x1

PercentFrequency is the normalized representation of the frequency that this
performance state represents. The highest performance state has a frequency of 100%, if
it is available. This avoids the problem of dealing with faster and faster CPUs. Under this
mechanism, a CPU that has a max speed of 450MHz uses the same algorithm as one that
runs at 700MHz.

MinCapacity is used to represent the minimum battery remaining capacity that is 
required for the CPU to be in this state. It should be noted that this only applies if the 
machine is running on DC since the relevant throttling policies are only available then. 
This value is expressed as a percentage. 
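
Taken literally, this definition means a state is usable only while the remaining battery capacity is at least its MinCapacity. The sketch below shows how a throttle cap could be derived from that reading. It is an illustration under assumptions: PerfStateSketch and throttle_limit_index are hypothetical names, states are assumed to be ordered from fastest to slowest, and the search is assumed to start at the knee of the curve.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of a perf state entry. */
typedef struct {
    unsigned char PercentFrequency;
    unsigned char MinCapacity;      /* percent of battery required */
} PerfStateSketch;

/*
 * Walk from the knee toward slower states and return the first one
 * whose MinCapacity is satisfied by the remaining capacity. If none
 * qualifies, the last (slowest) state is returned.
 */
size_t throttle_limit_index(const PerfStateSketch *states, size_t count,
                            size_t knee, unsigned capacity_percent)
{
    size_t i;

    assert(states != NULL && count > 0 && knee < count);

    for (i = knee; i < count - 1; i++) {
        if (capacity_percent >= states[i].MinCapacity) {
            break;
        }
    }
    return i;
}
```

Stopping the loop one short of the end guarantees that the returned index is always valid, even when the battery is below every state's threshold.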

IncreaseLevel and DecreaseLevel are the boundaries of the bucket that defines
the current state. If the CPU is busier than IncreaseLevel, the Operating System
should pick a higher processor frequency. If the CPU is less busy than
DecreaseLevel, the Operating System should pick a lower processor frequency.

It should be noted that PercentFrequency will be higher than IncreaseLevel
since the only way to reach PercentFrequency is for the Operating System to be
100% busy for the given frequency. That is, the CPU must be using every single cycle
allocated to it in order to be running at the PercentFrequency level of busyness. In order
to allow for promotion in cases where the system is not quite at that level of busyness,
IncreaseLevel must be a smaller value than PercentFrequency. If
IncreaseLevel is higher than PercentFrequency, then promotion can never
occur.

IncreaseCount and DecreaseCount are used to keep track of the number of
transitions from this state to another performance state. Transitions to a higher
performance state cause an increase in IncreaseCount. Transitions to a lower
performance state cause an increase in DecreaseCount.

PerformanceTime is used to keep track of the number of ticks that are spent at this
performance level. This value is updated whenever the processor switches to a different
performance state. When the user queries the performance information, the current
elapsed time at this state is added to PerformanceTime for the current performance
state.

3.2 PROCESSOR_POWER_STATE 

This structure is defined in sdk\inc\ntpoapi.h. This structure already exists but must be
grown to accommodate more information. The structure was removed from the published
header file and moved to the private ntos\inc\procpowr.h header file. Changed elements
are listed in red.

typedef struct {
    PPROCESSOR_IDLE_FUNCTION    IdleFunction;
    ULONG                       Idle0KernelTimeLimit;
    ULONG                       Idle0LastTime;
    PVOID                       IdleState;
    ULONGLONG                   LastCheck;
    PROCESSOR_IDLE_TIMES        IdleTimes;
    ULONG                       IdleTime1;
    ULONG                       PromotionCheck;
    ULONG                       IdleTime2;
    UCHAR                       CurrentThrottle;
    UCHAR                       ThermalThrottleLimit;
    UCHAR                       Spare1[2];
    UCHAR                       ThermalThrottleIndex;
    UCHAR                       CurrentThrottleIndex;
    ULONG                       Spare2[2];
    ULONG                       PerfSystemTime;
    ULONG                       PerfIdleTime;
    // Temp for debugging...
    ULONGLONG                   DebugDelta;
    ULONG                       DebugCount;
    ULONG                       LastSysTime;
    ULONGLONG                   TotalIdleStateTime[3];
    ULONG                       TotalIdleTransitions[3];
    ULONG                       Spare3;
    ULONGLONG                   PreviousC3StateTime;
    UCHAR                       KneeThrottleIndex;
    UCHAR                       ThrottleLimitIndex;
    UCHAR                       PerfStatesCount;
    UCHAR                       ProcessorMinThrottle;
    UCHAR                       ProcessorMaxThrottle;
    UCHAR                       LastBusyPercentage;
    UCHAR                       LastC3Percentage;
    UCHAR                       LastAdjustedBusyPercentage;
    UCHAR                       Spare1[1];
    ULONG                       Flags;
    ULONG                       PromotionCount;
    ULONG                       DemotionCount;
    LARGE_INTEGER               PerfCounterFrequency;
    ULONG                       PerfTickCount;
    KTIMER                      PerfTimer;
    KDPC                        PerfDpc;
    PPROCESSOR_PERF_STATE       PerfStates;
    PSET_PROCESSOR_THROTTLE     PerfSetThrottle;
} PROCESSOR_POWER_STATE, *PPROCESSOR_POWER_STATE;

The PerfDpc and PerfTimer variables are convenient storage areas for the context 
information that will be required to fire a periodic timer to make sure that the CPU is not 
too busy for its current performance level. The Flags field will be used to store useful 
state information. The only useful defines (as of this document) are: 

#define PSTATE_SUPPORTS_THROTTLE 0x1

#define PSTATE_ADAPTIVE_THROTTLE 0x2i

#define PSTATE_DEGRADED_THROTTLE 0x42-

#define PSTATE_CONSTANT_THROTTLE 0x84

It should be noted that PSTATE_ADAPTIVE_THROTTLE is what turns on the entire
throttling behavior and that the other flags are used to modify the behavior.
PSTATE_DEGRADED_THROTTLE and PSTATE_CONSTANT_THROTTLE are also
mutually exclusive flags.
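
The relationship between these flags can be enforced in one place. The sketch below is a hypothetical helper, not part of the design: the SK_ constants use placeholder bit values chosen for the example rather than values taken from the listing, and ThrottleMode and set_throttle_mode are invented names.

```c
#include <assert.h>

/* Placeholder bit values chosen for this example only. */
#define SK_ADAPTIVE_THROTTLE 0x1u
#define SK_DEGRADED_THROTTLE 0x2u
#define SK_CONSTANT_THROTTLE 0x4u

typedef enum { MODE_ADAPTIVE, MODE_DEGRADED, MODE_CONSTANT } ThrottleMode;

/*
 * Return `flags` updated for the requested mode. The adaptive flag is
 * always set because it enables throttling as a whole; the degraded and
 * constant modifiers are cleared first so that they can never both be
 * set. Unrelated bits in `flags` are preserved.
 */
unsigned set_throttle_mode(unsigned flags, ThrottleMode mode)
{
    const unsigned modifiers = SK_DEGRADED_THROTTLE | SK_CONSTANT_THROTTLE;

    flags &= ~modifiers;
    flags |= SK_ADAPTIVE_THROTTLE;
    if (mode == MODE_DEGRADED) {
        flags |= SK_DEGRADED_THROTTLE;
    } else if (mode == MODE_CONSTANT) {
        flags |= SK_CONSTANT_THROTTLE;
    }

    assert((flags & modifiers) != modifiers);
    return flags;
}
```

Clearing both modifiers before setting one means a switch from Constant to Degraded, or back to plain Adaptive, cannot leave a stale modifier behind.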

The PerfSystemTime and PerfIdleTime fields are used to keep track of
previous values to calculate the important time deltas. PerfSystemTime is used to
store the amount of system time previously elapsed, as expressed by
CUR_TIME(Prcb). PerfIdleTime is used to store the amount of time that was
spent in the idle thread as expressed by Thread->KernelTime.



PerfStates is a pointer to the PROCESSOR_PERF_STATE array associated with this
processor. Each processor must have a unique copy of this structure to maintain
per-processor information about the amount of time spent in each state.
PerfStatesCount is the number of elements within this array.

CurrentThrottleIndex is the index in the PerfStates array that has currently
been selected. This has been done using a for loop to find the entry in the
PerfStates array that corresponds to CurrentThrottle.

KneeThrottleIndex is the index in the PerfStates array that represents the
lowest-voltage highest-frequency part of the curve. This value is pre-calculated since the
Degraded and Constant Throttling policies depend upon it.

ThrottleLimitIndex is the index in the PerfStates array that represents the
maximum state that is acceptable under the current policy. The maximum of this value is
the KneeThrottleIndex. This value is modified by the Kernel's Battery subsystem
whenever it receives new battery capacity remaining information.

ThermalThrottleIndex is the index in the PerfStates array that represents the
maximum state that is acceptable based upon the thermal throttle. This value is used
under all policies.

ProcessorMinThrottle and ProcessorMaxThrottle are the minimum and
maximum frequencies available for this processor. This information used to be stored
system-wide in PopCapabilities, but has been moved to per-processor. In theory, it is
possible to implement a different minimum and maximum for each processor.

LastBusyPercentage is used to store the last value calculated by
PopCalculateBusyPercentage() for debugging purposes.
LastC3Percentage is used to store the last value calculated by
PopCalculateC3Percentage() for debugging purposes.

LastAdjustedBusyPercentage is used to store the modified busy frequency after
all the extraneous factors have been calculated. It is useful for providing debug
information.

PromotionCount keeps track of all the promotions done for this processor. 
DemotionCount keeps track of all the demotions done for this processor. This allows 
for user applications to quickly see the number of transitions that have been made. 

PreviousC3StateTime was the amount of time spent at C3 during the last 
successful throttle check. We need this time to compute the delta amount of time spent at 
C3 during the previous interval and thus determine the percentage time we spent at C3. 
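
As a minimal sketch of that delta computation (not the kernel routine itself), the helper below assumes that the C3 residency counter and the interval length are already expressed in the same time unit. The names c3_percent, total_c3, previous_c3, and interval are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: percentage of an interval spent in C3.
 * Assumes total_c3, previous_c3 and interval share one time unit
 * and that total_c3 is a monotonically increasing counter. */
static uint8_t c3_percent(uint64_t total_c3, uint64_t previous_c3,
                          uint64_t interval)
{
    uint64_t delta;
    uint64_t pct;

    if (interval == 0 || total_c3 < previous_c3) {
        return 0;                       /* nothing measurable */
    }
    delta = total_c3 - previous_c3;     /* C3 time since the last check */
    pct = (delta * 100) / interval;
    return (uint8_t)(pct > 100 ? 100 : pct);   /* cap at 100% */
}
```

After each successful throttle check, the caller would store the current counter as the new previous value so that the next call measures only the following interval.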

PerfTickCount is the CUR_TIME(Prcb) when the processor switches to a new
performance state. When the processor switches to a new performance state,
CUR_TIME(Prcb) minus PerfTickCount is added to the PerformanceTime
bucket for the previous performance level.

PerfCounterFrequency is the stored value of the counter frequency obtained from
KeQueryPerformanceCounter(). We cache this number because we want to minimize
the number of calls we make. PerfCounterFrequency is used to make the transitions
between values which are stored in PerformanceCounter units and those that are stored in 
Tick Counts. 
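
The conversion can be sketched as follows. This is only an illustration under stated assumptions: the counter frequency is in counts per second, the intermediate unit is 100 nanoseconds, and time_increment is the number of 100-nanosecond units per clock tick. The function name and the constant HUNDRED_NS_PER_SEC are hypothetical.

```c
#include <stdint.h>

#define HUNDRED_NS_PER_SEC 10000000ULL  /* assumed intermediate unit */

/* Hypothetical helper: convert a performance-counter delta into
 * clock ticks. frequency is in counts per second; time_increment is
 * the number of 100 ns units per tick. A very large counts value
 * could overflow the multiplication; a real implementation would
 * have to guard against that. */
static uint64_t perf_counts_to_ticks(uint64_t counts, uint64_t frequency,
                                     uint32_t time_increment)
{
    uint64_t hundred_ns;

    if (frequency == 0 || time_increment == 0) {
        return 0;                       /* no meaningful conversion */
    }
    hundred_ns = (counts * HUNDRED_NS_PER_SEC) / frequency;
    return hundred_ns / time_increment;
}
```

With a 1 MHz counter and a 10 ms tick (100000 units of 100 ns), 500000 counts correspond to half a second, or 50 ticks.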



PerfSetThrottle is the function that will get called when the Operating System
wants to set a new throttle. The reason that this is in this structure instead of a global 
variable is that we want to properly synchronize access to it. 

4.0 Algorithms 



4.1 Setting a Throttle Level 

The following code will be used to set a particular processor to a specific throttle level:



VOID
FASTCALL
PopSetThrottle (
    PPROCESSOR_POWER_STATE  PState,
    PPROCESSOR_PERF_STATE   PerfStates,
    ULONG                   Index,
    ULONG                   SystemTime,
    ULONG                   IdleTime
    )
{
    NTSTATUS    Status;
    PKPRCB      Prcb;
    PKTHREAD    Thread;
    UCHAR       Current = PState->CurrentThrottleIndex;

    PoPrint(
        PO_THROTTLE,
        "PopSetThrottle: Index=%d (%d%%) at %ld (system)"
        " %ld (idle)\n",
        Index,
        PerfStates[Index].PercentFrequency,
        SystemTime,
        IdleTime
        );

    //
    // Actually set the processor to the new throttle level
    //
    Status = PState->PerfSetThrottle(
        PerfStates[Index].PercentFrequency
        );
    if (!NT_SUCCESS(Status)) {

        //
        // If it didn't succeed, then don't update the
        // stats.
        //
        return;

    }

    //
    // Get the PRCB so that we can update the kernel and idle threads.
    // The reason we do this is that a transition is usually non-zero
    // work, so we don't want the transitions to affect the free/busy
    // calculations.
    //
    Prcb = KeGetCurrentPrcb();
    Thread = Prcb->IdleThread;
    SystemTime = POP_CUR_TIME(Prcb);
    IdleTime = Thread->KernelTime;
    PoPrint(
        PO_THROTTLE,
        "PopSetThrottle: Index=%d (%d%%) now at %ld (system)"
        " %ld (idle)\n",
        Index,
        PerfStates[Index].PercentFrequency,
        SystemTime,
        IdleTime
        );

    //
    // Update the bookkeeping for the current state
    //
    PerfStates[Current].PerformanceTime += (SystemTime -
        PState->PerfTickCount);

    //
    // Update the current throttle information
    //
    PState->CurrentThrottle = PerfStates[Index].PercentFrequency;
    PState->CurrentThrottleIndex = Index;

    //
    // Update our idea of what the current tick counts are
    //
    PState->PerfIdleTime = IdleTime;
    PState->PerfSystemTime = SystemTime;
    PState->PerfTickCount = SystemTime;

    //
    // Remember how much we spent in C3 at this point
    //
    PState->PreviousC3StateTime = PState->TotalIdleState[2];
}


This function, which will be called only on the target processor, while running either at 
DPC level or within the affinity of the target processor, will actually set the new throttle 
and update the bookkeeping. 

4.2 Busy and C3 Detection 

The following code can be called within the context of the target processor to determine 
how busy the CPU has been during the previous time period. This function would 
typically be called from the IdleThread, a DPC, or while running at DISPATCH_LEVEL.

UCHAR
PopCalculateBusyPercentage (
    PPROCESSOR_POWER_STATE  PState
    )
{
    PKPRCB      Prcb;
    PKTHREAD    Thread;
    UCHAR       Frequency;
    ULONGLONG   Idle;
    ULONG       Busy;
    ULONG       IdleTimeDelta;
    ULONG       CpuTimeDelta;

    Thread = KeGetCurrentThread();
    Prcb = CONTAINING_RECORD(PState, KPRCB, PowerState);

    IdleTimeDelta = Thread->KernelTime - PState->PerfIdleTime;
    CpuTimeDelta = CUR_TIME(Prcb) - PState->PerfSystemTime;
    Idle = (IdleTimeDelta * 100) / (CpuTimeDelta);

    //
    // We cannot be more than 100% idle, and if we are then we
    // are 0% busy (by definition), so apply the proper caps
    //
    if (Idle > 100) {
        return 0;
    }

    Busy = 100 - (ULONG) Idle;
    Frequency = (UCHAR) (Busy * PState->CurrentThrottle /
        POWER_PERF_SCALE);

    //
    // Remember how busy we were. This will make debugging so much
    // easier.
    //
    Prcb->PowerState.LastBusyPercentage = Frequency;
    return Frequency;
}



The Idle and Busy values represent a percentage of what the CPU was doing during the 
last interval. To simplify the math later on, these numbers are normalized against the 
current throttle value. 



For example: 

• If the CPU was 50% busy at 50% Throttle, that really means that the CPU was
25% busy at 100% throttle. 

• If the CPU was 100% busy at 25% Throttle, that really means that the CPU was 
25% busy at 100% throttle. 

• If the CPU was 10% busy at 10% Throttle, that really means that the CPU was 
1% busy at 100% throttle. 
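
The normalization in the examples above can be sketched as a one-line helper. This is an illustration rather than the kernel routine: the function name normalize_busy is hypothetical, and POWER_PERF_SCALE is assumed here to be 100.

```c
#include <stdint.h>

#define POWER_PERF_SCALE 100u   /* assumed: percentages are scaled to 100 */

/* Hypothetical helper: express a busy percentage measured at the
 * current throttle as the equivalent busy percentage at 100% throttle. */
static uint8_t normalize_busy(uint32_t busy_percent, uint32_t throttle_percent)
{
    return (uint8_t)(busy_percent * throttle_percent / POWER_PERF_SCALE);
}
```

Applied to the three examples, the helper yields 25, 25, and 1 respectively.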

Similarly, the formula for detecting how much time the CPU has spent in C3 during a 
known interval is: 



UCHAR
PopCalculateC3Percentage (
    PPROCESSOR_POWER_STATE  PState
    )
{
    PKPRCB          Prcb;
    ULONGLONG       CpuTimeDelta;
    ULONGLONG       C3;
    LARGE_INTEGER   C3Delta;

    Prcb = CONTAINING_RECORD(PState, KPRCB, PowerState);

    //
    // Calculate the C3 time delta in terms of Nanosecs.
    // The formulas for conversion are taken from
    // PopConvertUsToPerfCount
    //
    C3Delta.QuadPart = PState->TotalIdleState[2] -
        PState->PreviousC3StateTime;
    C3Delta.QuadPart = (US2SEC * US2TIME * C3Delta.QuadPart) /
        PState->PerfCounterFrequency.QuadPart;

    //
    // Now, calculate the CpuTimeDelta in terms of
    // Nanoseconds
    //
    CpuTimeDelta = (CUR_TIME(Prcb) - PState->PerfSystemTime) *
        KeTimeIncrement;

    //
    // Figure out the ratio of the two, and cap it
    // at 100%
    //
    C3 = C3Delta.QuadPart * 100 / CpuTimeDelta;
    if (C3 > 100) {
        return 100;
    }

    //
    // Remember what it was. This will make debugging so much
    // easier
    //
    Prcb->PowerState.LastC3Percentage = (UCHAR) C3;
    return (UCHAR) C3;
}

4.3 Calculating IncreaseLevel 

To calculate the upper bound for any PROCESSOR_PERF_STATE, the following
rules apply:

VOID
PopCalculateIncreaseLevel (
    PPROCESSOR_PERF_STATE   CpuState,
    ULONG                   CpuStateCount
    )
{
    ULONG   I;
    ULONG   DeltaPerf;

    //
    // Optimization for case where there are no CpuStates
    //
    if (CpuStateCount == 0) {
        return;
    }

    //
    // This guarantees that we can never promote past this state
    //
    CpuState[0].IncreaseLevel = CpuState[0].PercentFrequency + 1;

    //
    // Calculate the increase level
    //
    for (I = 1; I < CpuStateCount; I++) {

        DeltaPerf = CpuState[I-1].PercentFrequency -
            CpuState[I].PercentFrequency;
        DeltaPerf *= PopPerfIncreasePercentModifier;
        DeltaPerf /= POWER_PERF_SCALE;
        DeltaPerf += PopPerfIncreaseAbsoluteModifier;
        if (DeltaPerf > CpuState[I].PercentFrequency) {

            DeltaPerf = POWER_PERF_SCALE + 1;

        } else {

            DeltaPerf = CpuState[I].PercentFrequency -
                DeltaPerf;

        }
        CpuState[I].IncreaseLevel = (UCHAR) DeltaPerf;

    }
}




It should be noted that the increase level will always result in the percentage busyness
required for a promotion to the next higher throttle level, regardless of whether or not a
voltage change is required. The reason for this is that it is impossible to
actually increase more than one level using the Idle Detection algorithm presented in
this document.

If it is not desired that we should promote to a higher frequency within the same voltage
band, then this could be accomplished by removing any non-linear states from the list of
states or by forcing the increase level to be the same value used by the highest frequency 
state in the voltage range. 
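
As a worked illustration of the increase-level rules, the following self-contained sketch applies the same arithmetic to a plain array of percent frequencies. The function name, the flat arrays, and the modifier values used in the example (a percent modifier of 20 and an absolute modifier of 1) are assumptions made for this sketch; the actual defaults are not given in this section.

```c
#include <stddef.h>
#include <stdint.h>

#define PERF_SCALE 100u   /* assumed scale for percentages */

/* Hypothetical stand-alone version of the increase-level rules.
 * freq[] holds percent frequencies, highest first; level[] receives
 * the busy percentage above which each state would be promoted. */
static void compute_increase_levels(const uint8_t *freq, uint8_t *level,
                                    size_t count,
                                    uint32_t percent_modifier,
                                    uint32_t absolute_modifier)
{
    size_t i;

    if (count == 0) {
        return;
    }
    /* State 0 is the fastest state: make promotion impossible. */
    level[0] = (uint8_t)(freq[0] + 1);

    for (i = 1; i < count; i++) {
        uint32_t delta = (uint32_t)(freq[i - 1] - freq[i]);

        delta = delta * percent_modifier / PERF_SCALE;
        delta += absolute_modifier;
        if (delta > freq[i]) {
            level[i] = (uint8_t)(PERF_SCALE + 1);
        } else {
            level[i] = (uint8_t)(freq[i] - delta);
        }
    }
}
```

For states at 100%, 75%, and 50% with the assumed modifiers, the resulting increase levels are 101, 69, and 44.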

4.4 Calculating DecreaseLevel 

To calculate the lower bound for any PROCESSOR_PERF_STATE, the following
rules apply:

VOID
PopCalculateDecreaseLevel (
    PPROCESSOR_PERF_STATE   CpuState,
    ULONG                   CpuStateCount
    )
{
    //
    // We will be required to walk the CpuState array several times
    // and the only way to safely keep track of which index we are
    // looking at versus the one we care about is to use two variables
    // to keep track of indexes.
    //
    ULONG   I, J;
    ULONG   DeltaPerf;

    //
    // Sanity Check
    //
    if (CpuStateCount == 0) {
        return;
    }

    //
    // Set the decrease value for the last element in the array
    //
    CpuState[CpuStateCount-1].DecreaseLevel = 0;

    //
    // Calculate the decrease level
    //
    for (I = 0; I < (CpuStateCount - 1); I++) {

        DeltaPerf = CpuState[I].PercentFrequency -
            CpuState[I+1].PercentFrequency;
        DeltaPerf *= PopPerfDecreasePercentModifier;
        DeltaPerf /= POWER_PERF_SCALE;
        DeltaPerf += PopPerfDecreaseAbsoluteModifier;

        if (DeltaPerf > CpuState[I+1].PercentFrequency) {

            DeltaPerf = 0;

        } else {

            DeltaPerf = CpuState[I+1].PercentFrequency -
                DeltaPerf;

        }
        CpuState[I].DecreaseLevel = (UCHAR) DeltaPerf;

    }

    //
    // We want to eliminate demotions at the same voltage
    // level, so guarantee that the decrease levels result
    // in being set to the next voltage level...
    //
    I = 0;
    while (I < CpuStateCount) {

        //
        // Find the next non-linear state. We assume that I
        // is currently pointing at the highest-frequency
        // state within a voltage band and we are interested
        // in finding the highest-frequency state at the
        // next-lower voltage band.
        //
        for (J = I + 1; J < CpuStateCount; J++) {
            if (CpuState[J].Flags & POP_THROTTLE_NON_LINEAR) {
                break;
            }
        }

        //
        // Want to find the previous state since that
        // will be the decrease limit that we will use
        //
        J--;

        //
        // Set the decrease limit to this new level
        //
        while (I < J) {

            CpuState[I].DecreaseLevel =
                CpuState[J].DecreaseLevel;
            I++;

        }

        //
        // Skip the Jth state since it is the bottom of
        // the frequencies available for the current
        // voltage level. Note that we are skipping this
        // from I's point of view.
        //
        I++;

    }
}



This algorithm looks at the list of available states twice. The first time, it calculates the 
value that would be used to decrease the throttle to the next lower value. The second 
time, it calculates the value that would be used to decrease the throttle to the next lower
non-linear state. The reason that this algorithm is used is that there are almost no power
savings to decreasing the frequency but keeping the voltage constant. Thus, any
demotions should result in voltage changes. 

The reason that decreasing the frequency but keeping the voltage produces few power 
savings is that with an aggressive C2 policy, running the CPU at a higher frequency while 
spending lots of time in C2 is equivalent to running the CPU at a lower frequency while 
spending no time in C2. 
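
To make the effect of the second pass concrete, the sketch below runs only the band-collapsing step over a small sample table. The structure perf_state, the boolean non_linear field (assumed to mark the highest-frequency state of each voltage band), and the sample values are hypothetical stand-ins for the real data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified state record for illustration only. */
typedef struct {
    uint8_t percent_frequency;
    uint8_t decrease_level;   /* result of the first pass */
    int     non_linear;       /* assumed: first state of a voltage band */
} perf_state;

/* Second pass only: every state in a voltage band inherits the
 * decrease level of the lowest-frequency state in that band, so a
 * demotion is only triggered where the band's bottom state allows it. */
static void collapse_decrease_levels(perf_state *s, size_t count)
{
    size_t i = 0;
    size_t j;

    while (i < count) {
        /* Find the start of the next voltage band. */
        for (j = i + 1; j < count; j++) {
            if (s[j].non_linear) {
                break;
            }
        }
        j--;                          /* last state of the current band */
        while (i < j) {
            s[i].decrease_level = s[j].decrease_level;
            i++;
        }
        i++;                          /* skip the band's bottom state */
    }
}
```

With three single-state voltage bands followed by a knee state and three linear throttle states, the first two states keep their own decrease levels while the knee and the states below it all take the level of the final state.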

4.5 Calculating Increase/DecreaseTime

VOID
PopCalculateIncreaseDecreaseTime (
    PPROCESSOR_PERF_STATE       CpuState,
    ULONG                       CpuStateCount,
    PPROCESSOR_STATE_HANDLER2   PerfHandler
    )
{
    ULONG   I;
    ULONG   TickRate;
    ULONG   Time;

    if (CpuStateCount == 0) {
        return;
    }

    //
    // We can never increase from State 0
    //
    CpuState[0].IncreaseTime = (ULONG) -1;

    //
    // Get the current tick rate
    //
    TickRate = KeQueryTimeIncrement();

    //
    // Loop over the next elements...
    //
    for (I = 1; I < CpuStateCount; I++) {

        //
        // Decrease Time of previous state
        // should be based on whether current state is
        // linear or not
        //
        Time = PerfHandler->HardwareLatency * 10 *
            PopPerfDecreaseTimeValue;
        if (CpuState[I].Flags & POP_THROTTLE_NON_LINEAR ||
            CpuState[I-1].Flags & POP_THROTTLE_NON_LINEAR) {

            Time *= 10;

        }
        Time += PopPerfDecreaseTimeValue;

        //
        // We do have some minimums that we must respect
        //
        if (Time < PopPerfDecreaseMinimumTime) {
            Time = PopPerfDecreaseMinimumTime;
        }

        //
        // Time is in Micro Seconds. Need to convert to tick
        // counts.
        //
        CpuState[I-1].DecreaseTime = Time * US2TIME /
            TickRate + 1;

        //
        // Increase Time of current state should be
        // based on whether or not the state is
        // linear
        //
        Time = PerfHandler->HardwareLatency * 10 *
            PopPerfIncreaseTimeValue;
        if (CpuState[I-1].Flags & POP_THROTTLE_NON_LINEAR ||
            CpuState[I].Flags & POP_THROTTLE_NON_LINEAR) {

            Time *= 10;

        }
        Time += PopPerfIncreaseTimeValue;

        //
        // We do have some minimums that we must obey
        //
        if (Time < PopPerfIncreaseMinimumTime) {
            Time = PopPerfIncreaseMinimumTime;
        }

        //
        // Time is in Micro Seconds. Need to convert to tick
        // counts.
        //
        CpuState[I].IncreaseTime = Time * US2TIME /
            TickRate + 1;

    }

    //
    // We can never decrease from the last state
    //
    I--;
    CpuState[I].DecreaseTime = (ULONG) -1;
}

4.6 Calculate MinCapacity 

This routine is used to determine at what levels of battery capacity we start to force
the throttle down when we are running on the degraded throttle policy.

VOID
PopCalculatePerfMinCapacity (
    PPROCESSOR_PERF_STATE   PerfStates,
    ULONG                   PerfStatesCount
    )
{
    UCHAR   I;
    UCHAR   Knee = 0;
    UCHAR   Num;
    UCHAR   Total = PopPerfDegradeThrottleMinCapacity;
    UCHAR   Width = 0;

    if (!PerfStatesCount) {
        return;
    }

    //
    // Calculate the Knee of the Curve... this is quick and avoids
    // having to pass a PRCB or other structure around
    //
    for (I = (UCHAR) PerfStatesCount; I >= 1; I--) {

        if (PerfStates[I-1].Flags & POP_THROTTLE_NON_LINEAR) {

            Knee = (I-1);
            break;

        }

    }

    //
    // Look at all the states that happen before the knee and
    // set those to run only when the battery is fully charged
    //
    for (I = 0; I < Knee; I++) {

        //
        // Any of the steps before the knee are set to 100%
        //
        PerfStates[I].MinCapacity = 100;

    }

    //
    // Calculate the range for which we clamp down the throttle
    //
    Num = (UCHAR) PerfStatesCount - (Knee + 1);
    if (Num != 0) {
        Width = Total / Num;
    }

    //
    // Look at all the states from the Knee to the end. Starting
    // at the highest value, set the min capacity and subtract the
    // appropriate value to get the next min capacity.
    //
    for (I = Knee; I < PerfStatesCount; I++) {

        //
        // We put a floor onto how low we can force the throttle
        // down to. If this state is operating below that floor,
        // then we should set the MinCapacity to 0, which
        // reflects the fact that we don't want to degrade
        // past this point
        //
        if (PerfStates[I].PercentFrequency <
            PopPerfDegradeThrottleMinFrequency) {

            //
            // We modify the min capacity for the
            // previous state since we don't ever
            // want to demote from that state.
            // Also, once we start being less than
            // the min frequency, the min capacity
            // will always be set to 0, except for
            // the last state. But this is okay since
            // we look at each state in order. We also
            // have to make sure that we don't violate array
            // bounds, but this can only happen if
            // the perf states array is badly formed
            // or the min frequency is badly formed.
            //
            if (I != 0) {
                PerfStates[I-1].MinCapacity = 0;
            }
            PerfStates[I].MinCapacity = 0;
            continue;

        }
        PerfStates[I].MinCapacity = Total;
        Total -= Width;

    }
}



The logic behind this algorithm is that it is designed to allow the system to run at Lowest
Voltage-Highest Frequency (henceforth the KneeState) while the battery capacity
remaining is between 100% and PopPerfDegradeThrottleMinCapacity. Once
the battery capacity falls below that level, the highest allowed state is the next-Highest
Frequency available at the same voltage. When the battery capacity degrades past a
certain amount (which is based on the PopPerfDegradeThrottleMinCapacity
and the number of available states), the highest allowed state becomes the next-Highest
Frequency remaining. Thus, as the battery capacity diminishes, the processor runs at a
lower and lower frequency. 

Another feature of this algorithm is that it allows the OS to specify the smallest
frequency that the system can be forced to degrade to. For example, if a system supports
the following frequencies at the low voltage state: 50%, 40%, 30%, 20%, and 10%, then
setting PopPerfDegradeThrottleMinFrequency to 30% will guarantee that
we will not degrade the throttle below 30%.

To minimize the amount of calculations we must make at run time, we can pre-calculate 
the minimum battery capacity that must be remaining to be in the indicated performance 
state. It should be noted that any algorithm used must be able to handle the case where there are no
more performance states after the knee state. 
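
The pre-calculation can be illustrated with a self-contained sketch that follows the linear scheme of the listing in this section. The structure perf_state, its field names, and the sample inputs (a starting capacity of 50% and a frequency floor of 30%) are assumptions chosen for this example, not values taken from the design.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified state record for illustration only. */
typedef struct {
    uint8_t percent_frequency;
    uint8_t min_capacity;     /* output: battery % required for this state */
    int     non_linear;       /* assumed: state sits at a voltage boundary */
} perf_state;

static void compute_min_capacity(perf_state *s, size_t count,
                                 uint8_t total, uint8_t min_frequency)
{
    size_t i;
    size_t knee = 0;
    size_t num;
    uint8_t width = 0;

    if (count == 0) {
        return;
    }
    /* The knee is the last state flagged non-linear. */
    for (i = count; i >= 1; i--) {
        if (s[i - 1].non_linear) {
            knee = i - 1;
            break;
        }
    }
    /* States above the knee need a fully charged battery. */
    for (i = 0; i < knee; i++) {
        s[i].min_capacity = 100;
    }
    /* Spread the remaining capacity evenly over the states below the
     * knee; num is 0 when no states follow the knee. */
    num = count - (knee + 1);
    if (num != 0) {
        width = (uint8_t)(total / num);
    }
    for (i = knee; i < count; i++) {
        if (s[i].percent_frequency < min_frequency) {
            /* Below the floor: never force a demotion to here. */
            if (i != 0) {
                s[i - 1].min_capacity = 0;
            }
            s[i].min_capacity = 0;
            continue;
        }
        s[i].min_capacity = total;
        total = (uint8_t)(total - width);
    }
}
```

For states at 100%, 80%, 50% (the knee), 40%, 30%, 20%, and 10%, the sketch produces minimum capacities of 100, 100, 50, 38, 0, 0, and 0. The 30% state ends at 0 because the first state below the floor clears the entry of the state just above it.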

The beauty of this algorithm is that it can be easily replaced. If the Performance team
decides that a log, exp, or any other means of calculating the "curve" is more appropriate, 
then only this function needs to be changed. 

A sample equation could be: 



min Capacity - 




4.7 System IdleLoop 

To prevent the system from considering the promotion and demotion of performance 
states as part of how busy the system is, it is clearly desirable to invoke the increase and 
decrease of the CPU throttles from within the idle loop. 

The basic idea of the code in the idle loop is that the code should check to see what
percentage (normalized to POWER_PERF_SCALE) of time during the last interval was spent
running the idle thread and how much time was spent doing actual work. Once the
percentage of actual work done is calculated, then the Operating System should look at
the PerfStates table to see if this value falls within the operating parameters for the
current bucket. It is important to make this check first because buckets are allowed to
overlap each other.

If the value does not fall within the current bucket, then the operating system finds the 
closest performance state for which the value matches the parameters of the bucket. 
There is an important distinction here since by picking the nearest bucket, the operating 
system will have a tendency to not pick the highest or lowest performance states unless
there is absolutely no other choice. The advantage of this algorithm is that it will pick the 
PercentFrequency closest to the calculated value. 
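
The bucket search can be sketched in isolation as follows. This is a simplified illustration modeled on the search in the PoPerfIdle listing of this section: it promotes by at most one state and demotes across as many states as needed. The structure perf_bucket, the function pick_state, and the sample thresholds are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bucket: the busy range in which a state is kept. */
typedef struct {
    uint8_t increase_level;   /* promote when busy exceeds this */
    uint8_t decrease_level;   /* demote when busy falls below this */
} perf_bucket;

/* Returns the index of the state to run at. Index 0 is the fastest
 * state. The search only ever moves in one direction, so it cannot
 * oscillate between two neighboring states. */
static size_t pick_state(const perf_bucket *s, size_t count,
                         size_t current, uint8_t busy)
{
    size_t i = current;

    if (s[i].increase_level < busy) {
        if (i != 0) {
            i--;                      /* promote by one state */
        }
    } else if (s[i].decrease_level > busy) {
        do {
            if (i == count - 1) {
                break;                /* already at the slowest state */
            }
            i++;                      /* demote and re-check */
        } while (s[i].decrease_level > busy);
    }
    return i;
}
```

With buckets of (101, 60), (69, 40), and (44, 0), a busy value of 72 in state 1 promotes to state 0, a busy value of 30 in state 0 demotes to state 2, and a busy value of 50 in state 1 stays in state 1 because it lies inside the current bucket.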

It is important to note that the Idle loop runs at DPC level and within the context of the 
processor for which it is targeted. That means that the code within the Idle loop cannot 
call anything that is marked as pageable. The benefit of running at DPC level is that no 
synchronization is required if the data structures used by the Idle loop can only be 
accessed by a thread running on the same processor at DPC level. This function should 
also be called before the C-State handler is invoked. 

The algorithm would look something like this: 

VOID
PoPerfIdle (
    PPROCESSOR_POWER_STATE  PState
    )
{
    BOOLEAN                 C3Forced = FALSE;
    BOOLEAN                 Promoted = FALSE;
    BOOLEAN                 Demoted = FALSE;
    PKPRCB                  Prcb;
    PPROCESSOR_PERF_STATE   PerfStates;
    UCHAR                   I;
    UCHAR                   J;
    UCHAR                   CurrentPerfState;
    UCHAR                   Freq;
    ULONG                   IdleTime;
    ULONG                   TickCount;
    ULONG                   Time;
    ULONG                   TimeDelta;
    ULONG                   PerfStatesCount;

    //
    // This piece of code should actually be done in the main
    // PopIdle0 or PopProcessorIdle routines to save a function
    // call. However this code is included here for completeness
    //
    if (!(PState->Flags & PSTATE_ADAPTIVE_THROTTLE)) {
        return;
    }

    //
    // Has enough time expired?
    //
    Prcb = CONTAINING_RECORD(PState, KPRCB, PowerState);
    Time = POP_CUR_TIME(Prcb);
    IdleTime = Prcb->IdleThread->KernelTime;
    TimeDelta = Time - PState->PerfSystemTime;
    if (TimeDelta < PopPerfTimeDelta) {
        return;
    }

    //
    // Remember what the perf states are
    //
    PerfStates = PState->PerfStates;
    PerfStatesCount = PState->PerfStatesCount;

    //
    // Find the bucket with the correct frequency
    //
    CurrentPerfState = PState->CurrentThrottleIndex;
    I = CurrentPerfState;

    //
    // At this point, we need to see if the number of C3
    // transitions have exceeded a threshold value, and if
    // so, then we really need to throttle back to the
    // KneeThrottleIndex since we save more power if the
    // processor is at 100% and in C3 than if the processor
    // is at 12.5% and in C3.
    //
    Freq = PopCalculateC3Percentage(PState);
    if (Freq >= PopPerfMaxC3Frequency) {

        //
        // Set the throttle to the lowest knee in
        // the voltage & frequency curve
        //
        I = PState->KneeThrottleIndex;
        if (CurrentPerfState > I) {
            Promoted = TRUE;
        } else if (CurrentPerfState < I) {
            Demoted = TRUE;
        }

        //
        // Remember why we are doing this
        //
        C3Forced = TRUE;

        //
        // Skip to setting the throttle
        //
        goto PoPerfIdleSetThrottle;

    }

    //
    // Calculate how busy the CPU is
    //
    Freq = PopCalculateBusyPercentage(PState);

    //
    // Have we exceeded the thermal throttle limit?
    //
    if (Freq > PState->ThermalThrottleLimit) {

        //
        // The following code will force the frequency to
        // only as busy as the Thermal Throttle Limit will
        // actually allow. This removes the need for complicated
        // algorithms later on.
        //
        Freq = PState->ThermalThrottleLimit;
        I = PState->ThermalThrottleIndex;

    }

    //
    // Is there an upper limit to what the throttle can go to?
    // Note that because we check these after we have checked
    // the thermal limit, it means that it is not possible for
    // frequency to exceed the thermal limit that we have specified
    //
    if (PState->Flags & PSTATE_DEGRADED_THROTTLE) {

        //
        // Make sure that we don't exceed the
        // state that is specified
        //
        J = PState->ThrottleLimitIndex;
        if (Freq >= PerfStates[J].IncreaseLevel) {
            Freq = PerfStates[J].IncreaseLevel;
            I = J;
        }

    } else if (PState->Flags & PSTATE_CONSTANT_THROTTLE) {

        J = PState->KneeThrottleIndex;
        if (Freq >= PerfStates[J].IncreaseLevel) {
            Freq = PerfStates[J].IncreaseLevel;
            I = J;
        }

    }

    //
    // Remember the adjusted value for informational purposes
    // only
    //
    PState->LastAdjustedBusyPercentage = Freq;

    //
    // Find the processor frequency that best matches
    // the one that we have just calculated. Please
    // note that this algorithm is written in such
    // a way that I can only travel in a single
    // direction. It is possible to collapse the
    // following code down, but not without allowing
    // the possibility of I doing a "yo-yo" between
    // two states (and thus never terminating the
    // while-loop).
    //
    if (PerfStates[I].IncreaseLevel < Freq) {

        if (I != 0) {
            Promoted = TRUE;
            I--;
        }

    } else if (PerfStates[I].DecreaseLevel > Freq) {

        do {
            if (I == (PerfStatesCount - 1)) {
                // don't exceed the array
                break;
            }
            Demoted = TRUE;
            I++;
        } while (PerfStates[I].DecreaseLevel > Freq);

    }

PoPerfIdleSetThrottle:

    //
    // We have to make special allowances if we were forced to
    // throttle because of C3 considerations
    //
    if (!C3Forced) {

        //
        // See if enough time has expired to justify changing
        // the throttle. This code is here because certain
        // transitions are fairly expensive (like those across a
        // voltage state) while others are cheap. So the amount of
        // time required before we will consider promotion/demotion
        // from the expensive state might be longer than the
        // interval at which we run this function.
        //
        if ((Promoted && TimeDelta < PerfStates[I].IncreaseTime) ||
            (Demoted && TimeDelta < PerfStates[I].DecreaseTime)) {

            //
            // We haven't had enough time in the current
            // state to justify the promotion or demotion.
            // We don't update the bookkeeping since we
            // haven't considered the current interval as
            // a "success". So, we just return.
            //
            // N.B. It is very important that we don't
            // update PState->PerfSystemTime here. If we
            // did, then it is possible that TimeDelta would
            // never exceed the thresholds required.
            //
            // Base our actions for the timer upon the current
            // state instead of the target state
            //
            PopSetTimer(PState, CurrentPerfState);
            return;

        }

    }

    //
    // Note that we need to do this now because we do not want to
    // exit without having set or cancelled the timer as appropriate
    //
    PopSetTimer(PState, I);

    //
    // Update the promote and demote count
    //
    if (Promoted) {

        PerfStates[CurrentPerfState].IncreaseCount++;
        PState->PromotionCount++;

    } else if (Demoted) {

        PerfStates[CurrentPerfState].DecreaseCount++;
        PState->DemotionCount++;

    } else {

        //
        // At this point, we realize that we aren't
        // promoting or demoting and all the bookkeeping
        // is in order, so the appropriate thing to do
        // is just return.
        //
        PState->PerfIdleTime = IdleTime;
        PState->PerfSystemTime = Time;
        PState->PreviousC3StateTime = PState->TotalIdleState[2];
        return;

    }

    //
    // We have a new throttle. Update the bookkeeping to
    // reflect the amount of time that we spent in the
    // previous state and reset the count for the next
    // state
    //
    PopSetThrottle(
        PState,
        PerfStates,
        I,
        Time,
        IdleTime
        );
}

4.8 Processor Perf DPC 

A desirable feature to have in an adaptive throttling mechanism is the ability to sense that
the CPU has become 100% busy and that the throttle should be increased if required. The way
to accomplish this is to schedule a timer that fires if the throttle is not set to
100%.

It is important to note that this DPC may fire in situations where the CPU is not 100% 
busy within a given time quantum. However, since the Idle Handler resets the timer count 
every time it runs, the number of spurious calls to this routine should be small. 

It is important to note that once the DPC has been fired, there is no need to cancel the 
timer since it is not scheduled as a periodic timer. 

The sample algorithm would look like this: 

VOID
PopPerfIdleDpc (
    IN  PKDPC   Dpc,
    IN  PVOID   DpcContext,
    IN  PVOID   SystemArgument1,
    IN  PVOID   SystemArgument2
    )
{
    PKPRCB                  Prcb;
    PKTHREAD                IdleThread;
    PPROCESSOR_PERF_STATE   PerfStates;
    PPROCESSOR_POWER_STATE  PState;
    UCHAR                   CurrentPerfState;
    UCHAR                   Freq;
    UCHAR                   I;
    UCHAR                   J;
    ULONG                   IdleTime;
    ULONG                   Time;
    ULONG                   TimeDelta;

    //
    // We need to fetch the PRCB and the PState structures.
    // We could easily call KeGetCurrentPrcb() here, but since we
    // had room for a single context, why bother making the
    // inline call (which generates more code than using the
    // context field anyways) when we can simply remember it.
    // The memory for the context field is already allocated
    // anyways.
    //
    Prcb = (PKPRCB) DpcContext;
    PState = &(Prcb->PowerState);

    //
    // Remember what the perf states are
    //
    PerfStates = PState->PerfStates;
    CurrentPerfState = PState->CurrentThrottleIndex;

    //
    // Let's see if enough kernel time has expired since
    // the last check...
    //
    Time = POP_CUR_TIME(Prcb);
    TimeDelta = Time - PState->PerfSystemTime;
    if (TimeDelta < PopPerfCriticalTimeTicks) {

        PopSetTimer(PState, CurrentPerfState);
        return;

    }

    //
    // How much time has expired on the Idle thread?
    //
    IdleThread = Prcb->IdleThread;
    IdleTime = IdleThread->KernelTime;
    Freq = PopCalculateBusyPercentage(PState);

    //
    // We allow for a delta so that we can specify a range
    // at which we should promote
    //
    Freq += (UCHAR) PopPerfCriticalFrequencyDelta;

    //
    // Remember which index we are currently looking at
    //
    I = CurrentPerfState;

    //
    // Have we exceeded the thermal throttle limit?
    //
    if (Freq > PState->ThermalThrottleLimit) {

        //
        // The following code will force the frequency to
        // only as busy as the Thermal Throttle Limit will
        // actually allow. This removes the need for complicated
        // algorithms later on.
        //
        Freq = PState->ThermalThrottleLimit;
        I = PState->ThermalThrottleIndex;

    }

    //
    // Is there an upper limit to what the throttle can go to?
    // Note that because we check these after we have checked
    // the thermal limit, it means that it is not possible for
    // frequency to exceed the thermal limit that we have specified
    //
    if (PState->Flags & PSTATE_DEGRADED_THROTTLE) {

        //
        // Make sure that we don't exceed the
        // state that is specified
        //
        J = PState->ThrottleLimitIndex;
        if (Freq >= PerfStates[J].PercentFrequency) {
            Freq = PerfStates[J].PercentFrequency;
            I = J;
        }

    } else if (PState->Flags & PSTATE_CONSTANT_THROTTLE) {

        J = PState->KneeThrottleIndex;
        if (Freq >= PerfStates[J].PercentFrequency) {
            Freq = PerfStates[J].PercentFrequency;
            I = J;
        }

    } else if (PState->ThermalThrottleLimit == 0) {

        //
        // This state is special: we can only go to
        // the fastest state if there are no thermal restrictions
        //
        I = 0;
        Freq = PerfStates[I].PercentFrequency;

    }

    //
    // Remember this value for user information purposes
    //
    PState->LastAdjustedBusyPercentage = Freq;

    //
    // If this freq exceeds what we are currently running at,
    // then we should promote; otherwise, do nothing except
    // set the timer
    //
    if (Freq < PState->CurrentThrottle) {
        PopSetTimer(PState, CurrentPerfState);
        return;
    }

    //
    // Set the timer based upon what the new state will be
    //
    PopSetTimer(PState, I);

    //
    // Update the promote count
    //
    if (I < CurrentPerfState) {

        PerfStates[CurrentPerfState].IncreaseCount++;
        PState->PromotionCount++;

    } else {

        ASSERT(I < CurrentPerfState);
        return;

    }

    //
    // Set the new throttle
    //
    PopSetThrottle(
        PState,
        PerfStates,
        I,
        Time,
        IdleTime
        );
}



4.9 Setting the Watchdog Timer 

To properly set the watchdog timer that must be run (in case the system becomes too
busy), the following function is to be called:

NTSTATUS
PopSetTimer (
    IN PPROCESSOR_POWER_STATE PState,
    IN UCHAR                  Index
    )
{
    NTSTATUS      Status;
    LARGE_INTEGER DueTime;

    //
    // Cancel the timer under the following circumstances
    //
    if (Index == 0) {
        //
        // Already running at 100%, so we can't do anything to
        // make the computer run faster
        //
        KeCancelTimer( (PKTIMER) &(PState->PerfTimer) );
        Status = STATUS_CANCELLED;
    } else if ((PState->Flags & PSTATE_CONSTANT_THROTTLE) &&
               Index == PState->KneeThrottleIndex) {
        //
        // We are already running at the maximum constant
        // throttle allowed
        //
        KeCancelTimer( (PKTIMER) &(PState->PerfTimer) );
        Status = STATUS_CANCELLED;
    } else if ((PState->Flags & PSTATE_DEGRADED_THROTTLE) &&
               Index == PState->ThrottleLimitIndex) {
        //
        // We are already running at the maximum degraded
        // throttle allowed
        //
        KeCancelTimer( (PKTIMER) &(PState->PerfTimer) );
        Status = STATUS_CANCELLED;
    } else {
        //
        // No restrictions that we can think of, so set the
        // timer. Note that the semantics are useful here: if
        // the timer has already been set, then this resets
        // it (moves it back to the non-signaled state) and
        // recomputes the period
        //
        DueTime.QuadPart = -1 * PopPerfCriticalTimeDelta;
        KeSetTimer(
            (PKTIMER) &(PState->PerfTimer),
            DueTime,
            &(PState->PerfDpc)
            );
        Status = STATUS_SUCCESS;
    }
    return Status;
}
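
The decision made by PopSetTimer can be modeled outside the kernel as a pure function. The following is a minimal user-mode sketch, not the kernel routine: the flag values, the ModelPowerState type, and the name ShouldArmWatchdog are assumptions made for illustration only. It returns true where PopSetTimer would set the timer and false where it would cancel it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; the real bit assignments are not given in this text. */
#define PSTATE_CONSTANT_THROTTLE 0x1u
#define PSTATE_DEGRADED_THROTTLE 0x2u

/* Hypothetical model of the three fields that PopSetTimer consults. */
typedef struct {
    uint32_t Flags;
    uint8_t  KneeThrottleIndex;
    uint8_t  ThrottleLimitIndex;
} ModelPowerState;

/* Returns true when the watchdog timer should be armed, false when it
 * should be cancelled because no faster state is permitted. */
bool ShouldArmWatchdog(const ModelPowerState *state, uint8_t index)
{
    assert(state != NULL);

    if (index == 0) {
        /* Already running at 100%. */
        return false;
    }
    if ((state->Flags & PSTATE_CONSTANT_THROTTLE) &&
        index == state->KneeThrottleIndex) {
        /* Already at the maximum constant throttle allowed. */
        return false;
    }
    if ((state->Flags & PSTATE_DEGRADED_THROTTLE) &&
        index == state->ThrottleLimitIndex) {
        /* Already at the maximum degraded throttle allowed. */
        return false;
    }
    return true;
}
```

Keeping the decision separate from the KeSetTimer and KeCancelTimer calls makes the three cancellation cases easy to exercise in isolation.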



5.0 Initialization Changes 

5.1 Global Variable Initialization 

The following global variables must be initialized at the same time that the kernel is 
loaded. To simplify changing these values later on, the Kernel will provide some default 
values that can be overridden by the registry. 

• PopPerfTimeDelta: A value in microseconds that corresponds to the Time Delta
that must have occurred before the Idle thread will attempt to determine how busy
the CPU was during the previous interval. The kernel will convert this number to
the same units as PRCB->KernelTime in a variable known as PopPerfTimeTicks.

• PopPerfCriticalTimeDelta: A value in microseconds that corresponds to the Time
Delta that must have occurred before the Timer DPC will attempt to determine how
busy the CPU was during the previous interval. The kernel will convert this number
to the same units as PRCB->KernelTime in a variable known as
PopPerfCriticalTimeTicks.

• PopPerfCriticalIdleTimeDelta: A value in the same units as PRCB->KernelTime
that corresponds to the Time Delta that must have occurred for the IdleThread
during a PopPerfCriticalTimeDelta period. If this time has not occurred, then the
throttle will be raised by the TimerDPC.

• PopPerfCriticalFrequencyDelta: This is a percentage to add when calculating
CPU busyness in the watchdog timer. This value will allow for a faster triggering
of the watchdog under lighter loads. The recommended value is 0.

• PopPerfIncreasePercentModifier: A value between 0 and 100 that indicates what
percentage of the delta between the current state and the state to promote to
should be used to set the promote level. A lower value means that the overall
IncreaseLevel value will be higher (and thus promotions won't occur as
frequently). A suggested value would be 0%.

• PopPerfIncreaseAbsoluteModifier: A value between 0 and 100 that indicates how
many extra percentage points to remove from the promote level. A lower value
means that the overall IncreaseLevel value will be higher (and thus promotions
won't occur as frequently). It should be noted that if the value is particularly
high, then it might not be possible to promote from this state. A suggested value
would be 1%.

• PopPerfDecreasePercentModifier: A value between 0 and 100 that indicates what
percentage of the delta between the current state and the state to demote to
should be used to set the demote level. A higher value means that the overall
DecreaseLevel value will be lower (and thus demotions won't occur as
frequently). A suggested value would be 50%.

• PopPerfDecreaseAbsoluteModifier: A value between 0 and 100 that indicates how
many extra percentage points to subtract from the demote level. A higher value
means that the overall DecreaseLevel value will be lower (and thus demotions
won't occur as frequently). It should be noted that if the value is particularly
high, then it might not be possible to demote from this state. A suggested value
would be 1%.

• PopPerfIncreaseTimeValue: A value in microseconds that corresponds to the Time
Delta that must have occurred before a Throttle Increase is considered. This value
should be a multiple of PopPerfTimeDelta. This value may also serve as a basis
for calculating different time increments for each Throttle Step.

• PopPerfIncreaseMinimumTime: A value in microseconds that corresponds to
the absolute minimum amount of time since a promotion has occurred before
another one is allowed. A recommended value is 300 ms.



• PopPerfDecreaseTimeValue: A value in microseconds that corresponds to the Time
Delta that must have occurred before a Throttle Decrease is considered. This value
should be a multiple of PopPerfTimeDelta. This value may also serve as a basis
for calculating different time increments for each Throttle Step.

• PopPerfDecreaseMinimumTime: A value in microseconds that corresponds to the
absolute minimum amount of time since a demotion has occurred before another
one is allowed. A recommended value is 1000 ms.

• PopProcPerfStateLookAsideList: This is a lookaside list that will be used to
allocate PROCESSOR_PERF_STATE structures.

• PopPerfDegradeThrottleMinCapacity: A value between 0 and 100 that represents
at what point of battery capacity we will start forcing down the throttle when we
are in the Degraded Throttling mode. For example, a value of 50% means that we
will start throttling when the battery capacity reaches 50%.

• PopPerfDegradeThrottleMinFrequency: A value between 0 and 100 that
represents the lowest frequency that we can force the throttle down to when we
are in the Degraded Throttling mode. For example, a value of 30% means that
we will not force the throttle below 30%.

• PopPerfMaxC3Frequency: A percentage value that represents the maximum
amount of time that was spent in C3 for the last quantum before the idle loop will
decide that it should optimize for C3 usage. A sample value would be 50%.

5.2 PoInitializePrcb()

The PROCESSOR_POWER_STATE structure is initialized by PoInitializePrcb(). The 
following changes must occur: 

VOID
FASTCALL
PoInitializePrcb (
    PKPRCB Prcb
    )
{
    //
    // Zero power state structure
    //
    RtlZeroMemory( &Prcb->PowerState, sizeof(Prcb->PowerState) );
    //
    // Initialize to legacy functions with promotion from it disabled
    //
    Prcb->PowerState.Idle0KernelTimeLimit = (ULONG) -1;
    Prcb->PowerState.IdleFunction = PopIdle0;
    Prcb->PowerState.CurrentThrottle = POP_PERF_SCALE;
    //
    // Calculate this value exactly once
    //
    KeQueryPerformanceCounter(
        &Prcb->PowerState.PerfCounterFrequency
        );
    //
    // Initialize the Adaptive throttling subcomponents
    //
    KeInitializeDpc(
        &(Prcb->PowerState.PerfDpc),
        PopPerfAdaptiveThrottleDpc,
        Prcb
        );
    KeSetTargetProcessorDpc(
        &(Prcb->PowerState.PerfDpc),
        Prcb->Number
        );
    KeInitializeTimer(
        &(Prcb->PowerState.PerfTimer)
        );
}



5.3 PopSetPerfLevels() 

The PROCESSOR_PERF_STATE structure is initialized by calls to PopSetPerfLevels(). 
The initialization can occur as before, when it was initializing 
PROCESSOR_PERF_LEVEL instead. 



NTSTATUS
PopSetPerfLevels (
    IN PPROCESSOR_STATE_HANDLER2 ProcessorHandler
    )
{
    BOOLEAN                 FailedAllocation = FALSE;
    KAFFINITY               Processors, CurrentAffinity;
    KIRQL                   OldIrql;
    PKPRCB                  Prcb;
    NTSTATUS                Status = STATUS_SUCCESS;
    ULONG                   i;
    ULONG                   PerfStatesCount = 0;
    UCHAR                   Freq;
    UCHAR                   KneeThrottleIndex = 0;
    UCHAR                   MinThrottle;
    UCHAR                   MaxThrottle;
    UCHAR                   ThermalThrottleIndex = 0;
    PPROCESSOR_PERF_STATE   PerfStates = NULL;
    PPROCESSOR_PERF_STATE   TempStates;
    PPROCESSOR_POWER_STATE  PState;


    //
    // The first step is to convert the data that was passed to us
    // (PROCESSOR_PERF_LEVEL) over to PROCESSOR_PERF_STATE.
    //
    if (ProcessorHandler->NumPerfStates) {
        //
        // Because we are going to allocate the PerfStates
        // array first so that we can work on it, then copy
        // it to each processor, we must do that allocation from
        // non-paged pool.
        //
        PerfStatesCount = ProcessorHandler->NumPerfStates;
        PerfStates = ExAllocatePoolWithTag(
            NonPagedPool,
            PerfStatesCount * sizeof(PROCESSOR_PERF_STATE),
            'sPoP'
            );
        if (PerfStates == NULL) {
            //
            // We can handle this case. We will set the
            // status code to an appropriate failure code
            // and we will clean up the existing processor
            // states. The reason we do that is because
            // this function only gets called if the current
            // states are invalid, so keeping the current ones
            // would make no sense.
            //
            Status = STATUS_INSUFFICIENT_RESOURCES;
            PerfStatesCount = 0;
            goto PopSetPerfLevelsSetNewStates;
        }
        RtlZeroMemory(
            PerfStates,
            PerfStatesCount * sizeof(PROCESSOR_PERF_STATE)
            );
        //
        // Initialize each of the PROCESSOR_PERF_STATE entries
        //
        for (i = 0; i < PerfStatesCount; i++) {
            PerfStates[i].PercentFrequency =
                ProcessorHandler->PerfLevel[i].PercentFrequency;
            PerfStates[i].Power =
                ProcessorHandler->PerfLevel[i].Power;
        }

        //
        // Analyze the PerfStates to determine which entries are
        // linear/non-linear
        //
        PopAnalyzePerfStates( PerfStates, PerfStatesCount );
        //
        // Calculate the increase level, decrease level, and
        // increase/decrease time
        //
        PopCalculateIncreaseLevel( PerfStates, PerfStatesCount );
        PopCalculateDecreaseLevel( PerfStates, PerfStatesCount );
        PopCalculateMinCapacity( PerfStates, PerfStatesCount );
        PopCalculateIncreaseDecreaseTime(
            PerfStates,
            PerfStatesCount,
            PerfHandler
            );
        //
        // Calculate where the Knee in the performance curve is.
        // Walk from the last state toward index 0.
        //
        for (i = PerfStatesCount; i >= 1; i--) {
            if (PerfStates[i-1].Flags & POP_THROTTLE_NON_LINEAR) {
                KneeThrottleIndex = (UCHAR) (i-1);
                break;
            }
        }

        //
        // Find the minimum throttle value which is greater than
        // the specified default and the current max throttle
        //
        MinThrottle = POP_PERF_SCALE;
        MaxThrottle = 0;
        for (i = 0; i < PerfStatesCount; i++) {
            Freq = PerfStates[i].PercentFrequency;
            if (Freq < MinThrottle &&
                Freq >= PopIdleDefaultMinThrottle) {
                MinThrottle = Freq;
            }
            if (Freq > MaxThrottle &&
                Freq >= PopIdleDefaultMinThrottle) {
                MaxThrottle = Freq;
                ThermalThrottleIndex = (UCHAR) i;
            }
        }
        //
        // Make sure that we can run
        //
        ASSERT( MaxThrottle >= PopIdleDefaultMinThrottle );
    }



PopSetPerfLevelsSetNewStates:
    if (!PerfStates) {
        //
        // Remember that our min and max are 100%
        //
        MinThrottle = MaxThrottle = POP_PERF_SCALE;
    }



if (PcrfStatco) — {- 
■N- 

II Wc have pcrf otatco, — oo remember that in our 
// capabilitico 

PopCapabilitico . ProccooorThrottlc - TRUE; 
-H- 

// Find the minimum throttle value >~ 

// PopIdlcDcf aultMinThrottlc and the current maximum 
// throttle value 

PopCapabilitico .ProccooorMinThrottlc - POP_PERF_£CALE ; 
PopCapabilitico . ProccooorMaxThrottlc ~ 0; 

— (i~ 0 ; i <ProcGooorHandlcr - >NumPcrf States ; in) — f 

Frcq ~ Pcrf Statco - >Pcrf Level [i] . PcrccntFrcqucncy; 
4r£ — ((Frcq < PopCapabilitico . ProccooorMinThrottlc) — Sr& 
(Frcq >- PopIdlcDcfaultMinThrottlc) ) — {- 

PopCapabilitico ■ ProccaoorMinThrottlG ~ Frcq; 

4-f — ((Frcq > PopCapabilitico . ProccooorMaxThrottlc) — 
(Frcq >- PopIdlcDcfaultMinThrottlc) ) — b 

PopCapabilitico . ProccooorMaxThrottlc - Frcq; 
ThcrmalThrottlcIndcx - i; 

+ 

// There better be SOME opced wc can run at. 

ASSERT (PopCapabilitico . ProccooorMaxThrottlc >~ 
PopIdlcDcfaultMinThrottlo) ; 

-j — cloc — {- 

II Wc don't have any pcrf oatco, — oo remember that in our 

// capabilitico 

-H- 

PopCapabilitico . ProccooorThrottlc - FALSE; 
PopCapabilitico . ProccooorMaxThrottlc POP_PERF_SCALE; 
P o p Ca pabilitico . ProccaoorMinThrottlG POP_PERF_SCALE; 

+ 
    //
    // Initialize the PPROCESSOR_POWER_STATE for each processor
    //
    Processors = KeActiveProcessors;
    CurrentAffinity = 1;
    while (Processors) {
        if (!(Processors & CurrentAffinity)) {
            CurrentAffinity <<= 1;
            continue;
        }
        //
        // Remember that we did this processor and make sure that
        // we are actually running on that processor. This ensures
        // that we are synchronized with the DPC and idle loop
        // routines
        //
        Processors &= ~CurrentAffinity;
        KeSetSystemAffinityThread( CurrentAffinity );
        CurrentAffinity <<= 1;
        //
        // To make sure that we aren't pre-empted, we must raise
        // to DISPATCH_LEVEL
        //
        KeRaiseIrql( DISPATCH_LEVEL, &OldIrql );
        //
        // Get the PRCB and PPROCESSOR_POWER_STATE structures that
        // we will need to manipulate
        //
        Prcb = KeGetCurrentPrcb();
        PState = &Prcb->PowerState;
        //
        // Remember what our thermal limit is
        //
        PState->ThermalThrottleLimit = MaxThrottle;

PopCapabilitico . ProcGsoorMaxThrottlc ; 
        PState->ThermalThrottleIndex = ThermalThrottleIndex;
        //
        // Likewise, remember what the min and max throttle are
        //
        PState->ProcessorMinThrottle = MinThrottle;
        PState->ProcessorMaxThrottle = MaxThrottle;
        //
        // To get the bookkeeping to work out correctly, we will
        // set the throttle to 0% (which is not possible), set the
        // current index to the last state, and set current tick
        // count to the current time.
        //
        PState->CurrentThrottle = 0;
        PState->PerfTickCount = POP_CUR_TIME(Prcb);
        if (PerfStatesCount) {
            PState->CurrentThrottleIndex = PerfStatesCount - 1;
        } else {
            PState->CurrentThrottleIndex = 0;
        }

        //
        // Reset the Knee Index. This indicates where the knee
        // in the performance curve is
        //
        PState->KneeThrottleIndex = KneeThrottleIndex;
        //
        // Reset the Throttle Limit Index
        //
        PState->ThrottleLimitIndex = KneeThrottleIndex;
        //
        // Reset these values since it doesn't make much sense to
        // keep track of these across state changes
        //
        PState->PromotionCount = 0;
        PState->DemotionCount = 0;
        //
        // Reset the values to something that makes sense. We can
        // assume that we started at 100% busy and 0% C3 Idle
        //
        PState->LastBusyPercentage = 100;
        PState->LastAdjustedBusyPercentage = 100;
        PState->LastC3Percentage = 0;
        //
        // If there are already perf states present for this
        // processor, then free them
        //
        if (PState->PerfStates) {

ExFrGcToNPagcdLookasidcList ( 

&PopProcPcrf StatcLookAsidcList , 
PStatc - >Pcrf States 

            ExFreePool( PState->PerfStates );
            PState->PerfStates = NULL;
            PState->PerfStatesCount = 0;
        }
        //
        // At this point, we have to distinguish our behavior based
        // on whether or not we have perf states
        //
        if (PerfStates) {
            //
            // We do, so let's allocate some memory and make a copy
            // of the template that we already created.
            //

TempStates - ExAllocatcFromNPagcdLookaoidcLiot ( 
&PopProcPcrf StatcLookAsidcLiot 

            TempStates = ExAllocatePoolWithTag(
                NonPagedPool,
                PerfStatesCount * sizeof(PROCESSOR_PERF_STATE),
                'sPoP'
                );
            if (TempStates == NULL) {
                //
                // Not being able to allocate this structure
                // is surely fatal. The only way to get around
                // it (I think) is to break out of this case
                // and treat it as if there are no PerfStates
                // available.
                //
                Status = STATUS_INSUFFICIENT_RESOURCES;
                FailedAllocation = TRUE;
                PState->Flags &= ~PSTATE_SUPPORTS_THROTTLE;
                PState->PerfSetThrottle = NULL;
                KeLowerIrql( OldIrql );
                continue;

KoBugChcckEx ( 

INTERNALJ?OWER_FAILURE, 
_g 

STATUS_INSUFF ICIE NT_RESOURCES, 
LINE , 

            } else {
                //
                // Copy the template to the one associated with
                // the processor
                //
                RtlCopyMemory(
                    TempStates,
                    PerfStates,
                    PerfStatesCount *
                        sizeof(PROCESSOR_PERF_STATE)
                    );
                PState->PerfStates = TempStates;
                PState->PerfStatesCount = PerfStatesCount;
            }
            //
            // Remember that we support throttling
            //
            PState->Flags = PSTATE_SUPPORTS_THROTTLE;
            PState->PerfSetThrottle =
                ProcessorHandler->SetPerfLevel;
            //
            // Update the processor status now
            //
            PopUpdateProcessorThrottle();
        } else {
            //
            // Remember that we don't support throttling
            //
            PState->Flags = 0;
            PState->PerfSetThrottle = NULL;
        }

// Update the processor throttle function 
-H- 

if ( PS tatc - >Pcrf States) — [- 

PState - >PopSctThrottle - 

ProccssorHandlcr - >SctPcrf Level ; 

) else ( 

PStatG - >PopSctThrottlc ~ NULL; 

j 

-H- 

II Update — the pr o ces s or throttle — (since wc arc already 

-H- — running on the target processor 

+4- 

Po pU pdateProcessorThrottle () ; 
        //
        // We can now return to our previous IRQL
        //
        KeLowerIrql( OldIrql );
    }
    //
    // Did we fail an allocation and thus require a cleanup?
    //
    if (FailedAllocation) {
        Processors = KeActiveProcessors;
        CurrentAffinity = 1;
        while (Processors) {
            if (!(Processors & CurrentAffinity)) {
                CurrentAffinity <<= 1;
                continue;
            }
            Processors &= ~CurrentAffinity;
            KeSetSystemAffinityThread( CurrentAffinity );
            CurrentAffinity <<= 1;
            KeRaiseIrql( DISPATCH_LEVEL, &OldIrql );
            Prcb = KeGetCurrentPrcb();
            PState = &(Prcb->PowerState);










            //
            // Reset the PowerState
            //
            PState->ThermalThrottleLimit = POP_PERF_SCALE;
            PState->ThermalThrottleIndex = 0;
            PState->ProcessorMinThrottle = POP_PERF_SCALE;
            PState->ProcessorMaxThrottle = POP_PERF_SCALE;
            PState->CurrentThrottle      = POP_PERF_SCALE;
            PState->PerfTickCount        = POP_CUR_TIME(Prcb);
            PState->CurrentThrottleIndex = 0;
            PState->KneeThrottleIndex    = 0;
            PState->ThrottleLimitIndex   = 0;
            //
            // Free allocated structures, if any
            //

            if (PState->PerfStates) {
                MaxThrottle =
                    PState->PerfStates[0].PercentFrequency;
                ExFreePool( PState->PerfStates );
            } else {
                MaxThrottle = POP_PERF_SCALE;
            }
            PState->PerfStates = NULL;
            PState->PerfStatesCount = 0;
            //
            // Return to 100% if possible
            //
            if (PState->PerfSetThrottle) {
                PState->PerfSetThrottle( MaxThrottle );
            }
            PState->Flags = 0;
            PState->PerfSetThrottle = NULL;
            //
            // Return to previous IRQL
            //
            KeLowerIrql( OldIrql );
        }
        //
        // Set the global caps
        //
        PopCapabilities.ProcessorThrottle = FALSE;
        PopCapabilities.ProcessorMinThrottle = POP_PERF_SCALE;
        PopCapabilities.ProcessorMaxThrottle = POP_PERF_SCALE;
    } else {
        //
        // Remember these caps
        //
        PopCapabilities.ProcessorThrottle = (PerfStates != NULL);
        PopCapabilities.ProcessorMinThrottle = MinThrottle;
        PopCapabilities.ProcessorMaxThrottle = MaxThrottle;
    }
    //
    // Return to the proper affinity
    //
    KeRevertToUserAffinityThread();
    //
    // Free the memory we used
    //
    if (PerfStates) {
        ExFreePool( PerfStates );
    }
    //
    // Return whatever status code we have
    //
    return Status;
}

This function has been extensively changed from the existing code base. The first and 
most obvious change is the fact that we allocate per-processor perf state arrays. This 
means that we have to be able to handle the case where we cannot allocate such an array. 
The solution to this problem is to basically treat that failure as an inability of the system 
to throttle. 

The changes to this function also mean that a great deal of additional intelligence will
have to be present in PopUpdateProcessorThrottle().

The purpose of this algorithm is to flush each processor from using the old processor
state tables. This is accomplished by running the thread in the context of each processor.
Since this structure is only accessed within the context of a TimerDPC and the
IdleThread, which both run at DPC level, and thus cannot be pre-empted, this
guarantees that neither the IdleThread nor the TimerDPC can be running. If neither
is running, then neither is using the old copy of the structure, which means that it can
be safely freed and the new one used in its place.

The only potential problem with this algorithm is dealing with a potential low memory
situation. It is not acceptable to share copies of the PerfStates structure among
multiple processors. That leaves as solutions either guaranteeing that we don't run into
low memory problems or issuing a bugcheck if we do. To avoid running into low memory
problems, the PerfStates structures can be allocated from an NPAGED_LOOKASIDE
list. A further optimization is that if the old PerfStates structure is the same size or
larger than the new one, it could be used instead.
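
The per-processor walk that this algorithm relies on (test the current affinity bit, clear it from the set of remaining processors, do the work, shift to the next bit) can be sketched in user mode as follows. This is an illustration only: AffinityMask, ProcessorVisitor, VisitActiveProcessors, and RecordProcessor are hypothetical names, and the callback stands in for the work the kernel does after KeSetSystemAffinityThread and KeRaiseIrql.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t AffinityMask;   /* hypothetical stand-in for KAFFINITY */
typedef void (*ProcessorVisitor)(unsigned processor, void *context);

/* Calls visit once for every set bit in active, lowest bit first,
 * and returns the number of processors visited. */
unsigned VisitActiveProcessors(AffinityMask active,
                               ProcessorVisitor visit,
                               void *context)
{
    AffinityMask current = 1;
    unsigned number = 0;
    unsigned visited = 0;

    assert(visit != NULL);

    while (active != 0) {
        if (active & current) {
            active &= ~current;      /* remember that we did this processor */
            visit(number, context);  /* per-processor work goes here */
            visited++;
        }
        current <<= 1;               /* next affinity bit */
        number++;
    }
    return visited;
}

/* Example visitor: records each visited processor number as a bit. */
void RecordProcessor(unsigned processor, void *context)
{
    uint64_t *seen = (uint64_t *) context;
    *seen |= (uint64_t) 1 << processor;
}
```

The loop ends as soon as the set of remaining processors is empty, so sparse masks do not cost a full scan of every bit.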

5.4 PopUpdateAllThrottles()

This function is used to update all the throttles simultaneously. 

VOID
PopUpdateAllThrottles (
    VOID
    )
{
    KAFFINITY               Processors;
    KAFFINITY               CurrentAffinity;
    KIRQL                   OldIrql;
    PPROCESSOR_POWER_STATE  PState;

    Processors = KeActiveProcessors;
    CurrentAffinity = 1;
    while (Processors) {
        if (Processors & CurrentAffinity) {
            Processors &= ~CurrentAffinity;
            KeSetSystemAffinityThread( CurrentAffinity );
            //
            // We must call PopUpdateProcessorThrottle
            // at DISPATCH_LEVEL
            //
            KeRaiseIrql( DISPATCH_LEVEL, &OldIrql );
            PState = &(KeGetCurrentPrcb()->PowerState);
            if (PState->Flags & PSTATE_SUPPORTS_THROTTLE) {
                PopUpdateProcessorThrottle();
            }
            KeLowerIrql( OldIrql );
        }
        CurrentAffinity <<= 1;
    }
    KeRevertToUserAffinityThread();
}



The principal change to this code is that we no longer have an early exit case if there are
no perf states registered. In theory, we could add a check that would look at
PopCapabilities.ProcessorThrottle.

The other change is that we will always call PopUpdateProcessorThrottle at
DISPATCH_LEVEL. This will guarantee that the DPC routine will not be able to
pre-empt the routine.

5.5 PopUpdateProcessorThrottles() 



VOID
PopUpdateProcessorThrottles (
    VOID
    )
{
    PKPRCB                  Prcb;
    PPROCESSOR_PERF_STATE   PerfStates;
    PPROCESSOR_POWER_STATE  PState;
    UCHAR                   i;
    UCHAR                   Index;
    UCHAR                   NewLimit;
    UCHAR                   PerfStatesCount;
    ULONG                   IdleTime;
    ULONG                   Time;

    //
    // Get the PowerState structure from the PRCB
    //
    Prcb = KeGetCurrentPrcb();
    PState = &Prcb->PowerState;
    //
    // Make sure that this processor supports throttling
    //
    if (PState->PopSetThrottle == NULL) {
        return;
    }
    //
    // Get the current information such as current throttle, current
    // throttle index, current system time, current idle time
    //
    NewLimit = PState->CurrentThrottle;
    Index = PState->CurrentThrottleIndex;
    Time = CUR_TIME(Prcb);
    IdleTime = Prcb->IdleThread->KernelTime;
    //
    // We will need to refer to these frequently
    //
    PerfStates = PState->PerfStates;
    PerfStatesCount = PState->PerfStatesCount;
    //
    // If we are on AC, then we always want to run at the highest
    // possible speed. Also the same algorithm is used on DC if the
    // dynamic throttling policy is PO_THROTTLE_NONE.
    //
    if ((PopPolicy == &PopAcPolicy) ||
        (PopPolicy->DynamicThrottle == PO_THROTTLE_NONE)) {
        //
        // We precompute what the max throttle should be
        //
        Index = PState->ThermalThrottleIndex;
        NewLimit = PerfStates[Index].PercentFrequency;
    } else {
        //
        // We are on DC, apply the appropriate heuristics based on
        // the dynamic throttling policy.
        //
        switch (PopPolicy->DynamicThrottle) {
        case PO_THROTTLE_CONSTANT:
            //
            // We have pre-computed the optimal point on the curve
            // already. So, we might as well use that.
            //
            Index = PState->KneeThrottleIndex;
            NewLimit = PerfStates[Index].PercentFrequency;
            //
            // Set the constant flag and clear the degraded flag
            //
            PState->Flags &= ~PSTATE_DEGRADED_THROTTLE;
            PState->Flags |= PSTATE_CONSTANT_THROTTLE;
            PState->Flags |= PSTATE_ADAPTIVE_THROTTLE;
            break;

        case PO_THROTTLE_DEGRADE:
            //
            // We calculate the limit of the degrade throttle
            // on the fly.
            //
            Index = PState->ThrottleLimitIndex;
            NewLimit = PerfStates[Index].PercentFrequency;
            //
            // Set the degraded flag and clear the constant flag
            //
            PState->Flags &= ~PSTATE_CONSTANT_THROTTLE;
            PState->Flags |= PSTATE_DEGRADED_THROTTLE;
            PState->Flags |= PSTATE_ADAPTIVE_THROTTLE;
            break;

        case PO_THROTTLE_ADAPTIVE:
            PState->Flags |= PSTATE_ADAPTIVE_THROTTLE;
            break;

        default:
            // not implemented
            PoPrint(
                PO_THROTTLE,
                ("PopUpdateProcessorThrottle - unimplemented"
                 " dynamic throttle %d\n",
                 PopPolicy->DynamicThrottle)
                );
            break;
        }
    }

    //
    // Check if we are over the thermal limit.
    //
    ASSERT( PState->ThermalThrottleLimit >=
            PopCapabilities.ProcessorMinThrottle );
    if (NewLimit > PState->ThermalThrottleLimit) {
        PoPrint(
            PO_THERM,
            ("PopUpdateProcessorThrottle - new throttle limit %d"
             " over thermal limit %d\n",
             NewLimit,
             PState->ThermalThrottleLimit)
            );
        NewLimit = PState->ThermalThrottleLimit;
        Index = PState->ThermalThrottleIndex;
    }
    //
    // Apply new throttle if it has changed.
    //
    if (NewLimit != PState->CurrentThrottle) {
        PoPrint(
            PO_THROTTLE,
            ("PopUpdateProcessorThrottle - Setting CPU throttle "
             "to %d\n", NewLimit)
            );
        if (NewLimit < PState->CurrentThrottle) {
            PState->DemotionCount++;
            PerfStates[PState->CurrentThrottleIndex].
                DecreaseCount++;
        } else {
            PState->PromotionCount++;
            PerfStates[PState->CurrentThrottleIndex].
                IncreaseCount++;
        }
        PopSetThrottle(
            PState,
            PerfStates,
            Index,
            Time,
            IdleTime
            );
    }
}

It should be noted that this function can only be called within the context of the target 
processor. This function does not acquire any spinlocks because it is running at 
DISPATCH_LEVEL, thus preventing the Timer DPC and the Idle Thread from running 
on this processor. 
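
The limit selection performed by this routine (choose a target by power source and policy, then clamp to the thermal limit) can be summarized by the following user-mode sketch. It is an illustration under stated assumptions, not the kernel code: ThrottlePolicy, ModelState, ThrottleChoice, and ChooseThrottle are hypothetical names, the flag updates and promotion/demotion counters are omitted, and the frequency table is assumed to list the fastest state at index 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PO_THROTTLE_* policy values. */
typedef enum {
    THROTTLE_NONE,
    THROTTLE_CONSTANT,
    THROTTLE_DEGRADE,
    THROTTLE_ADAPTIVE
} ThrottlePolicy;

/* Hypothetical model of the per-processor fields the routine reads. */
typedef struct {
    const uint8_t *PercentFrequency;  /* one entry per perf state */
    uint8_t CurrentThrottle;
    uint8_t CurrentThrottleIndex;
    uint8_t KneeThrottleIndex;
    uint8_t ThrottleLimitIndex;
    uint8_t ThermalThrottleLimit;
    uint8_t ThermalThrottleIndex;
} ModelState;

typedef struct {
    uint8_t Limit;
    uint8_t Index;
} ThrottleChoice;

/* Picks the throttle that the policy asks for, then clamps it to the
 * thermal limit. onAc is nonzero when running on AC power. */
ThrottleChoice ChooseThrottle(const ModelState *s, int onAc,
                              ThrottlePolicy policy)
{
    ThrottleChoice c;

    assert(s != NULL && s->PercentFrequency != NULL);

    /* Default (adaptive): keep whatever we are running at now. */
    c.Limit = s->CurrentThrottle;
    c.Index = s->CurrentThrottleIndex;

    if (onAc || policy == THROTTLE_NONE) {
        c.Index = s->ThermalThrottleIndex;
        c.Limit = s->PercentFrequency[c.Index];
    } else if (policy == THROTTLE_CONSTANT) {
        c.Index = s->KneeThrottleIndex;
        c.Limit = s->PercentFrequency[c.Index];
    } else if (policy == THROTTLE_DEGRADE) {
        c.Index = s->ThrottleLimitIndex;
        c.Limit = s->PercentFrequency[c.Index];
    }

    /* Never exceed the thermal limit. */
    if (c.Limit > s->ThermalThrottleLimit) {
        c.Limit = s->ThermalThrottleLimit;
        c.Index = s->ThermalThrottleIndex;
    }
    return c;
}
```

A caller would apply the result only when the chosen limit differs from the current throttle, as the routine above does.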

5.5 PopApplyThermalThrottle() 

VOID 

PopApplyThermalThrottle ( 
VOID 
) 

{ 

// 

// <Code which is not relevant to this spec here> 
// 

fif DBG 

PoPrint ( 

PCjTHERM, 

("Th e rma l Zone %p %s Thermal throttle - %d.%dn", 

thormalZono, — fe-r 
(thcrmalThrottlc / 10), 
(thormalThrottlo °o 10)) 

P o P r in t ( 

POJTHERM, 

("Thermal Z o n e %p %-s Forced throttle %d.°Gd\n", 

thormalZono, — t-r 
(forccdThrottlc / 10), 
(forcedThrottle I 10)) 

ttcndif 

    //
    // Set limit on affected processors
    //
    processorNumber = 0;
    currentAffinity = 1;
    processors = KeActiveProcessors;
    do {
        if (!(processors & currentAffinity)) {
            currentAffinity <<= 1;
            continue;
        }
        processors &= ~currentAffinity;
        //
        // We must run on the target processor
        //
        KeSetSystemAffinityThread( currentAffinity );

pStatc - & (KcGGtCurrcntPrcb () - >PoworStatc) ; 
        //
        // We need to be running at DISPATCH_LEVEL to access the
        // structures referenced within the pState...
        //
        KeRaiseIrql( DISPATCH_LEVEL, &OldIrql );
        pState = &(KeGetCurrentPrcb()->PowerState);
        if ((pState->Flags & PSTATE_SUPPORTS_THROTTLE) == 0) {
            currentAffinity <<= 1;
            KeLowerIrql( OldIrql );
            continue;
        }
        //
        // Convert throttles to processor bucket size. We need to
        // do this in the context of the target processor
        // to make sure that we get the correct set of perf levels
        //
        PopRoundThrottle(
            (UCHAR) (thermalThrottle/PO_TZ_THROTTLE_SCALE),
            &thermalLimit,
            NULL,
            &thermalLimitIndex,
            NULL
            );
        PopRoundThrottle(
            (UCHAR) (forcedThrottle/PO_TZ_THROTTLE_SCALE),
            &forcedLimit,
            NULL,
            &forcedLimitIndex,
            NULL
            );
#if DBG
        PoPrint(
            PO_THERM,
            ("Thermal - Zone %p - %d - Thermal Limit = %d\n",
             thermalZone, Prcb->Number, thermalLimit)
            );
        PoPrint(
            PO_THERM,
            ("Thermal - Zone %p - %d - Forced Limit = %d\n",
             thermalZone, Prcb->Number, forcedLimit)
            );
#endif

        //
        // Figure out which one we are going to use
        //
        limit = (thermalProcessors & currentAffinity) ?
            thermalLimit : forcedLimit;
        index = (thermalProcessors & currentAffinity) ?
            thermalLimitIndex : forcedLimitIndex;
        //
        // Next affinity mask
        //
        currentAffinity <<= 1;
        //
        // Does this processor support throttling?
        //
        if (pState->PopSetThrottle == NULL) {
            KeLowerIrql( OldIrql );
            continue;
        }
        //
        // Check processors limit for a change
        //
        if (limit > pState->ProcessorMaxThrottle) {
            PoPrint(
                PO_THERM,
                ("PopThrottle: Limit (%d) > Scale (%d)\n",
                 limit,
                 PopCapabilities.ProcessorMaxThrottle)
                );
            limit = pState->ProcessorMaxThrottle;
        } else if (limit < pState->ProcessorMinThrottle) {
            PoPrint(
                PO_THERM,
                ("PopThrottle: Limit (%d) < MinThrottle "
                 "(%d)\n",
                 limit,
                 PopCapabilities.ProcessorMinThrottle)
                );
            limit = pState->ProcessorMinThrottle;
        }
        if (pState->ThermalThrottleLimit != limit) {
            pState->ThermalThrottleLimit = limit;
            pState->ThermalThrottleIndex = index;
        }
        //
        // Revert back to our previous IRQL
        //
        KeLowerIrql( OldIrql );
    } while (processors);
    KeRevertToUserAffinityThread();
    //
    // Apply thermal throttles if necessary. Note we always do this
    // whether or not the limits were changed. This routine also gets
    // called whenever the system transitions from AC to DC, and that
    // may also require a throttle update due to dynamic throttling.
    //
    PopUpdateAllThrottles();
}



5.6 PopRoundThrottle()



VOID
PopRoundThrottle (
    IN UCHAR                Throttle,
    OUT OPTIONAL PUCHAR     RoundDown,
    OUT OPTIONAL PUCHAR     RoundUp,
    OUT OPTIONAL PUCHAR     RoundDownIndex,
    OUT OPTIONAL PUCHAR     RoundUpIndex
    )
{
    KIRQL                   OldIrql;
    PKPRCB                  Prcb;
    PPROCESSOR_PERF_STATE   PerfStates;
    PPROCESSOR_POWER_STATE  PState;
    UCHAR                   Low;
    UCHAR                   LowIndex;
    UCHAR                   High;
    UCHAR                   HighIndex;
    ULONG                   I;

    //
    // We need to get this processor's power capabilities
    //
    Prcb = KeGetCurrentPrcb();
    PState = &(Prcb->PowerState);
    //
    // Make sure that we are synchronized with the Idle thread
    // and other routines that access these data structures
    //
    KeRaiseIrql( DISPATCH_LEVEL, &OldIrql );
    PerfStates = PState->PerfStates;

    //
    // Does this processor support throttling?
    //
    if (!(PState->Flags & PSTATE_SUPPORTS_THROTTLE)) {

if ( ARGUMENT_PRESENT (RoundUp) ) — f 
*RoundUp - Throttle; 

if ( ARCUMENT_PRE SEN T (RoundUpIndcx) ) — {- 
*RoundUpIndcx - Q; 

if (ARGUMENT_PRESEN T (RoundDown) ) — h 
* RoundDown - Throttle; 

if ( ARGUMENT_PRESENT (RoundDown Index) ) [ 
*RoundDownIndox ~ 0; 

+ 

KcLowcrlrql ( Oldlrq l ) ; 
return ; 

        Low = High = Throttle;
        LowIndex = HighIndex = 0;
        goto PopRoundThrottleExit;
    }

    //
    // Check if the supplied throttle is out of range
    //
    if (Throttle <= PState->ProcessorMinThrottle) {
        Throttle = PState->ProcessorMinThrottle;
    } else if (Throttle >= PState->ProcessorMaxThrottle) {
        Throttle = PState->ProcessorMaxThrottle;
    }

    //
    // Initialize our search space to something reasonable
    //
    Low = High = PerfStates[0].PercentFrequency;
    LowIndex = HighIndex = 0;
    //
    // Look at all the available perf states
    //
    for (I = 0; I < PState->PerfStatesCount; I++) {
        if (Low > Throttle) {
            if (PerfStates[I].PercentFrequency < Low) {
                Low = PerfStates[I].PercentFrequency;
                LowIndex = (UCHAR) I;
            }
        } else if (Throttle > Low) {
            if (PerfStates[I].PercentFrequency <= Throttle &&
                PerfStates[I].PercentFrequency > Low) {
                Low = PerfStates[I].PercentFrequency;
                LowIndex = (UCHAR) I;
            }
        }
        if (High < Throttle) {
            if (PerfStates[I].PercentFrequency > High) {
                High = PerfStates[I].PercentFrequency;
                HighIndex = (UCHAR) I;
            }
        } else if (Throttle < High) {
            if (PerfStates[I].PercentFrequency >= Throttle &&
                PerfStates[I].PercentFrequency < High) {
                High = PerfStates[I].PercentFrequency;
                HighIndex = (UCHAR) I;
            }
        }
    }

PopRoundThrottleExit:

    //
    // Revert back to our previous IRQL.
    //
    KeLowerIrql( OldIrql );

    //
    // Fill in the pointers provided by the caller
    //
    if (ARGUMENT_PRESENT(RoundUp)) {

        *RoundUp = High;
        if (ARGUMENT_PRESENT(RoundUpIndex)) {

            *RoundUpIndex = HighIndex;

        }

    }

    if (ARGUMENT_PRESENT(RoundDown)) {

        *RoundDown = Low;
        if (ARGUMENT_PRESENT(RoundDownIndex)) {

            *RoundDownIndex = LowIndex;

        }

    }

}

The changes in this routine include an optimization for the case where the desired
throttle lies above the maximum or below the minimum. The routine also now returns the
indices into the PerfStates array that correspond to the rounded-up and rounded-down values.
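
The rounding behaviour described above can be tried outside the kernel with a small user-mode sketch. It is only an illustration under stated assumptions: the names PerfStateSketch and round_throttle are hypothetical, the minimum and maximum are derived from the table itself rather than read from the processor power state, and no IRQL handling is modelled.

```c
/* Hypothetical stand-in for one entry of the PerfStates array. */
typedef struct {
    unsigned char PercentFrequency;
} PerfStateSketch;

/*
 * Round `throttle` to the nearest available states.
 * On return, *up / *upIndex describe the smallest state >= throttle and
 * *down / *downIndex the largest state <= throttle. A request outside the
 * range covered by the table is clamped first (assumption: the limits are
 * taken from the table, not from separate min/max fields).
 */
void round_throttle(const PerfStateSketch *states, unsigned count,
                    unsigned char throttle,
                    unsigned char *up, unsigned *upIndex,
                    unsigned char *down, unsigned *downIndex)
{
    unsigned char low, high, minPct, maxPct;
    unsigned lowIndex = 0, highIndex = 0, i;

    /* Find the range the table can express and clamp the request into it. */
    minPct = maxPct = states[0].PercentFrequency;
    for (i = 1; i < count; i++) {
        if (states[i].PercentFrequency < minPct) minPct = states[i].PercentFrequency;
        if (states[i].PercentFrequency > maxPct) maxPct = states[i].PercentFrequency;
    }
    if (throttle < minPct) throttle = minPct;
    if (throttle > maxPct) throttle = maxPct;

    /* Start the search from the first entry. */
    low = high = states[0].PercentFrequency;

    for (i = 0; i < count; i++) {
        unsigned char f = states[i].PercentFrequency;

        if (low > throttle) {
            /* Still above the request: keep moving down. */
            if (f < low) { low = f; lowIndex = i; }
        } else if (f <= throttle && f > low) {
            /* At or below the request: move closer from underneath. */
            low = f; lowIndex = i;
        }

        if (high < throttle) {
            /* Still below the request: keep moving up. */
            if (f > high) { high = f; highIndex = i; }
        } else if (f >= throttle && f < high) {
            /* At or above the request: move closer from above. */
            high = f; highIndex = i;
        }
    }

    *up = high;   *upIndex = highIndex;
    *down = low;  *downIndex = lowIndex;
}
```

With a table of 100, 75, 50 and 25 percent, a request of 60 rounds up to 75 (index 1) and down to 50 (index 2); a request above 100 or below 25 is clamped, so both results coincide with the end of the table.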

5.7 PopCompositeBatteryDeviceHandler() 

This routine is notified whenever the total remaining battery level changes. For
adaptive throttling to work, each new battery notification must update the current
ThrottleLimitIndex.
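
As a standalone illustration, separate from the kernel routine listed in this section, the per-processor update can be sketched in user-mode C. Everything named here (BatteryPerfStateSketch, CpuPowerStateSketch, limit_index_for_capacity, update_throttle_limits) is hypothetical, and the sketch assumes each perf state carries a MinCapacity threshold and that the active processors are given as a bit mask. Thread affinity and IRQL changes are not modelled.

```c
/* Hypothetical perf-state entry with a battery-capacity threshold. */
typedef struct {
    unsigned char PercentFrequency;
    unsigned char MinCapacity;
} BatteryPerfStateSketch;

/* Hypothetical per-processor power state. */
typedef struct {
    const BatteryPerfStateSketch *PerfStates;
    unsigned PerfStatesCount;
    unsigned KneeThrottleIndex;
    unsigned ThrottleLimitIndex;
} CpuPowerStateSketch;

/*
 * Scan from the knee index and stop at the first state whose MinCapacity
 * is at least the reported capacity. If no state qualifies, the result
 * equals PerfStatesCount.
 */
unsigned limit_index_for_capacity(const CpuPowerStateSketch *cpu,
                                  unsigned capacity)
{
    unsigned i;

    for (i = cpu->KneeThrottleIndex; i < cpu->PerfStatesCount; i++) {
        if (cpu->PerfStates[i].MinCapacity >= capacity) {
            break;
        }
    }
    return i;
}

/*
 * Visit every processor whose bit is set in activeMask and refresh its
 * ThrottleLimitIndex. Each handled bit is cleared from the mask so the
 * loop ends once all active processors have been visited.
 * Returns the number of processors updated.
 */
unsigned update_throttle_limits(CpuPowerStateSketch *cpus,
                                unsigned long activeMask,
                                unsigned capacity)
{
    unsigned long bit = 1;
    unsigned cpu = 0, updated = 0;

    while (activeMask) {
        if (activeMask & bit) {
            cpus[cpu].ThrottleLimitIndex =
                limit_index_for_capacity(&cpus[cpu], capacity);
            activeMask &= ~bit;
            updated++;
        }
        bit <<= 1;
        cpu++;
    }
    return updated;
}
```

The threshold values used to exercise this sketch are arbitrary; it only mirrors the shape of the loop, not the policy behind the thresholds.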

VOID
PopCompositeBatteryDeviceHandler (
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP           Irp,
    IN PVOID          Context
    )
{

    <...>

    if (NT_SUCCESS(Irp->IoStatus.Status)) {

        //
        // Handle the completed request
        //
        switch (PopCB.State) {
        <...>

        case PO_CB_READ_STATUS:

            if (Policy == &PopDCPolicy) {

                <...>

                //
                // This is kind of silly, but since we
                // want to minimize our synchronization
                // elsewhere, we have to examine every
                // processor's PowerState and update the
                // ThrottleLimitIndex on each. This
                // may eventually be the smart thing to
                // do if not all processors support the
                // same set of states.
                //
                currentAffinity = 1;
                processors = KeActiveProcessors;

                while (processors) {

                    if (!(processors & currentAffinity)) {

                        currentAffinity <<= 1;
                        continue;

                    }

                    KeSetSystemAffinityThread( currentAffinity );
                    processors &= ~currentAffinity;
                    currentAffinity <<= 1;

                    //
                    // We need to run at DISPATCH_LEVEL to
                    // properly synchronize access to these
                    // power structures
                    //
                    KeRaiseIrql( DISPATCH_LEVEL, &OldIrql );

                    Prcb = KeGetCurrentPrcb();
                    PState = &(Prcb->PowerState);
                    PerfStates = PState->PerfStates;
                    PerfStatesCount = PState->PerfStatesCount;

                    for (I = PState->KneeThrottleIndex;
                         I < PerfStatesCount;
                         I++) {

                        if (PerfStates[I].MinCapacity >=
                            PopCB.Status.Capacity) {

                            break;

                        }

                    }

                    PState->ThrottleLimitIndex = I;



                    //
                    // We can revert back to our previous
                    // IRQL now
                    //
                    KeLowerIrql( OldIrql );

                    KeRevertToUserAffinityThread();
                    <...>

                    <...>
                }

                <...>
            }

            <...>
        }

    }



